darktable 3.4/3.5 OpenCL slow on Windows 10

OpenCL kernels are always compiled locally: this happens when you start darktable and it does not find compiled versions matching your current video driver. You can see this by running darktable with the -d opencl command-line argument, but only if you first delete the already compiled versions. On Linux they live in ~/.cache/darktable/cached_kernels_for_{name of device}_{version of driver}, e.g. for me: cached_kernels_for_GeForceGTX10606GB_46080 (the device name and the driver version 460.80, with spaces and punctuation removed).
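If you would rather not hunt for the directory by hand, here is a minimal Python sketch that lists the kernel cache directories and optionally deletes them. It only assumes the Linux location and the cached_kernels_for_ prefix mentioned above; the function names are my own and not part of darktable. It defaults to a dry run, so nothing is deleted unless you ask for it.

```python
from pathlib import Path
import shutil


def find_kernel_caches(cache_root):
    """Return the cached_kernels_for_* directories under cache_root, sorted by name."""
    cache_root = Path(cache_root)
    if not cache_root.is_dir():
        return []
    return sorted(p for p in cache_root.glob("cached_kernels_for_*") if p.is_dir())


def clear_kernel_caches(cache_root, dry_run=True):
    """Report the kernel cache directories and delete them unless dry_run is set."""
    caches = find_kernel_caches(cache_root)
    if not dry_run:
        for path in caches:
            shutil.rmtree(path)
    return caches


if __name__ == "__main__":
    # Assumed Linux default from the post; adjust for other systems.
    root = Path.home() / ".cache" / "darktable"
    for path in clear_kernel_caches(root, dry_run=True):
        print("would delete:", path)
```

Call clear_kernel_caches(root, dry_run=False) while darktable is closed, then start darktable with -d opencl to see the compilation messages.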

Output with compilation:

0.039812 [opencl_init] opencl related configuration options:
0.039828 [opencl_init] 
0.039831 [opencl_init] opencl: 1
0.039836 [opencl_init] opencl_scheduling_profile: 'default'
0.039840 [opencl_init] opencl_library: ''
0.039843 [opencl_init] opencl_memory_requirement: 768
0.039847 [opencl_init] opencl_memory_headroom: 400
0.039849 [opencl_init] opencl_device_priority: '*/!0,*/*/*/!0,*'
0.039853 [opencl_init] opencl_mandatory_timeout: 200
0.039856 [opencl_init] opencl_size_roundup: 16
0.039858 [opencl_init] opencl_async_pixelpipe: 1
0.039860 [opencl_init] opencl_synch_cache: active module
0.039865 [opencl_init] opencl_number_event_handles: 1000
0.039868 [opencl_init] opencl_micro_nap: 0
0.039871 [opencl_init] opencl_use_pinned_memory: 0
0.039873 [opencl_init] opencl_use_cpu_devices: 0
0.039876 [opencl_init] opencl_avoid_atomics: 0
0.039878 [opencl_init] 
0.040083 [opencl_init] found opencl runtime library 'libOpenCL'
0.040105 [opencl_init] opencl library 'libOpenCL' found on your system and loaded
0.067237 [opencl_init] found 1 platform
0.067246 [opencl_init] found 1 device
0.067456 [opencl_init] device 0 `GeForce GTX 1060 6GB' has sm_20 support.
0.067616 [opencl_init] device 0 `GeForce GTX 1060 6GB' supports image sizes of 16384 x 32768
0.067620 [opencl_init] device 0 `GeForce GTX 1060 6GB' allows GPU memory allocations of up to 1519MB
[opencl_init] device 0: GeForce GTX 1060 6GB 
     GLOBAL_MEM_SIZE:          6077MB
     MAX_WORK_GROUP_SIZE:      1024
     MAX_WORK_ITEM_DIMENSIONS: 3
     MAX_WORK_ITEM_SIZES:      [ 1024 1024 64 ]
     DRIVER_VERSION:           460.80
     DEVICE_VERSION:           OpenCL 1.2 CUDA
0.124780 [opencl_init] options for OpenCL compiler: -w -cl-fast-relaxed-math  -DNVIDIA_SM_20=1 -DNVIDIA=1 -I"/home/kofa/darktable-master/share/darktable/kernels"
0.124873 [opencl_init] compiling program `demosaic_ppg.cl' ..
0.124915 [opencl_fopen_stat] could not open file `/home/kofa/.cache/darktable/cached_kernels_for_GeForceGTX10606GB_46080/demosaic_ppg.cl.bin'!
0.124919 [opencl_load_program] could not load cached binary program, trying to compile source
0.124922 [opencl_load_program] successfully loaded program from '/home/kofa/darktable-master/share/darktable/kernels/demosaic_ppg.cl' MD5: '873aa05f976ebda5de7eee1601037421'
0.127093 [opencl_build_program] successfully built program
0.127098 [opencl_build_program] BUILD STATUS: 0
0.127100 BUILD LOG:
0.127100 

0.127104 [opencl_build_program] saving binary
... repeated for the other kernels ...
0.160316 [opencl_init] kernel loading time: 0.0355 
0.160321 [opencl_init] OpenCL successfully initialized.
0.160323 [opencl_init] here are the internal numbers and names of OpenCL devices available to darktable:
0.160324 [opencl_init]          0       'GeForce GTX 1060 6GB'
0.160326 [opencl_init] FINALLY: opencl is AVAILABLE on this system.
0.160327 [opencl_init] initial status of opencl enabled flag is ON.
0.160335 [opencl_create_kernel] successfully loaded kernel `blendop_mask_Lab' (0) for device 0
0.160339 [opencl_create_kernel] successfully loaded kernel `blendop_mask_RAW' (1) for device 0
0.160343 [opencl_create_kernel] successfully loaded kernel `blendop_mask_rgb_hsl' (2) for device 0
... repeated for the other kernels ...

If the compiled kernels are already cached, you won’t see the compilation step, only the loading messages.
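To check a long log without reading it line by line, a rough Python sketch like the one below can pick out which programs were compiled from source. It relies only on the two message formats visible in the output above ("compiling program" followed by "could not load cached binary program, trying to compile source"). Those formats may differ in other darktable versions, and the function name is my own.

```python
import re

# Message formats copied from the -d opencl output above; other versions may word them differently.
COMPILING = re.compile(r"compiling program `([^']+)'")
CACHE_MISS = "could not load cached binary program, trying to compile source"


def programs_compiled_from_source(log_text):
    """Return the names of programs whose cached binary was missing, in log order."""
    compiled = []
    current = None
    for line in log_text.splitlines():
        match = COMPILING.search(line)
        if match:
            # Remember which program the following messages refer to.
            current = match.group(1)
        elif CACHE_MISS in line and current is not None:
            compiled.append(current)
            current = None
    return compiled
```

Save the output of a run with -d opencl to a file, read it in, and pass the text to the function. An empty list means no cache-miss message was logged, i.e. no compilation step was recorded.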
