With OpenCL:
darktable-cli setubal.orf setubal.orf.xmp test.jpg --core -d perf -d opencl
: [dev_process_export] pixel pipeline processing took 5.563 secs (13.421 CPU)
Without:
darktable-cli setubal.orf setubal.orf.xmp test.jpg --core -d perf -d opencl --disable-opencl
: [dev_process_export] pixel pipeline processing took 13.078 secs (127.521 CPU)
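To put a number on the difference, here is a minimal Python sketch that parses the two summary lines quoted above and computes the wall-clock speedup. The names `parse_times`, `WITH_OPENCL` and `WITHOUT_OPENCL` are my own, not anything from darktable, and the regex only assumes the `took X secs (Y CPU)` format seen in these lines.

```python
import re

# The two summary lines quoted above (with and without OpenCL).
WITH_OPENCL = "[dev_process_export] pixel pipeline processing took 5.563 secs (13.421 CPU)"
WITHOUT_OPENCL = "[dev_process_export] pixel pipeline processing took 13.078 secs (127.521 CPU)"

# Matches the "took <wall> secs (<cpu> CPU)" part of a -d perf line.
PATTERN = re.compile(r"took ([\d.]+) secs \(([\d.]+) CPU\)")


def parse_times(line):
    """Return (wall_seconds, cpu_seconds) from a darktable perf summary line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError(f"no timing found in: {line!r}")
    return float(match.group(1)), float(match.group(2))


wall_cl, cpu_cl = parse_times(WITH_OPENCL)
wall_cpu, cpu_cpu = parse_times(WITHOUT_OPENCL)

# Wall-clock speedup of the OpenCL run over the CPU-only run.
speedup = wall_cpu / wall_cl
print(f"wall-clock speedup with OpenCL: {speedup:.2f}x")
print(f"CPU time: {cpu_cpu:.1f} s without vs {cpu_cl:.1f} s with OpenCL")
```

For these two runs that works out to roughly a 2.35x wall-clock speedup, with the CPU doing about a tenth of the work when OpenCL is enabled.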
In GPU-compute, your card should be about twice as fast as mine.
GPU Compute (PassMark):
Radeon PRO W6600: 9895 Ops/Sec
GeForce GTX 1060: 4322 Ops/Sec (-56.3%)
(Radeon PRO W6600 vs GeForce GTX 1060 [videocardbenchmark.net] by PassMark Software)
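As a quick sanity check on "about twice as fast", this small Python sketch recomputes the ratio and the percentage from the two PassMark scores above. The variable names are mine. Whether PassMark's GPU Compute score predicts darktable's OpenCL performance is an assumption, not something the benchmark page claims.

```python
# PassMark "GPU Compute" scores quoted above (Ops/Sec).
w6600_score = 9895
gtx1060_score = 4322

# How many times faster the W6600 is in this synthetic test.
ratio = w6600_score / gtx1060_score

# Relative deficit of the GTX 1060, as PassMark reports it.
deficit = (gtx1060_score - w6600_score) / w6600_score * 100

print(f"W6600 / GTX 1060 = {ratio:.2f}x")
print(f"GTX 1060 vs W6600 = {deficit:.1f}%")
```

So the W6600 scores a bit more than twice as high (about 2.3x), which matches the -56.3% shown for the GTX 1060.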
My OpenCL logs:
2.2212 [dt_dev_load_raw] loading the image. took 0.587 secs (0.563 CPU)
2.2789 [export] creating pixelpipe took 0.055 secs (0.398 CPU)
2.2790 [dt_opencl_check_tuning] use 4808MB (headroom=OFF, pinning=OFF) on device `NVIDIA CUDA NVIDIA GeForce GTX 1060 6GB' id=0
2.2793 [dev_pixelpipe] took 0.000 secs (0.000 CPU) initing base buffer [export]
2.2934 [dev_pixelpipe] took 0.014 secs (0.065 CPU) [export] processed `rawprepare' on GPU, blended on GPU
2.2990 [dev_pixelpipe] took 0.006 secs (0.002 CPU) [export] processed `temperature' on GPU, blended on GPU
2.3266 [dev_pixelpipe] took 0.028 secs (0.023 CPU) [export] processed `highlights' on GPU, blended on GPU
2.4592 [dev_pixelpipe] took 0.133 secs (0.127 CPU) [export] processed `hotpixels' on CPU, blended on CPU
2.5928 [dev_pixelpipe] took 0.134 secs (0.143 CPU) [export] processed `demosaic' on GPU, blended on GPU
3.9984 [dev_pixelpipe] took 1.406 secs (0.866 CPU) [export] processed `denoiseprofile' on GPU with tiling, blended on CPU
4.5705 [dev_pixelpipe] took 0.572 secs (1.564 CPU) [export] processed `lens' on GPU, blended on GPU
4.6047 [dev_pixelpipe] took 0.034 secs (0.029 CPU) [export] processed `ashift' on GPU, blended on GPU
4.6263 [dev_pixelpipe] took 0.022 secs (0.017 CPU) [export] processed `exposure' on GPU, blended on GPU
4.6620 [dev_pixelpipe] took 0.036 secs (0.027 CPU) [export] processed `colorin' on GPU, blended on GPU
4.6827 [dt_ioppr_transform_image_colorspace_cl] IOP_CS_LAB-->IOP_CS_RGB took 0.020 secs (0.013 GPU) [channelmixerrgb]
4.7277 [dev_pixelpipe] took 0.066 secs (0.046 CPU) [export] processed `channelmixerrgb' on GPU, blended on GPU
4.8807 [dt_ioppr_transform_image_colorspace] IOP_CS_RGB-->IOP_CS_LAB took 0.064 secs (0.657 CPU) [atrous]
5.9407 [dev_pixelpipe] took 1.213 secs (1.770 CPU) [export] processed `atrous' on GPU with tiling, blended on CPU
6.0632 [dt_ioppr_transform_image_colorspace_cl] IOP_CS_LAB-->IOP_CS_RGB took 0.012 secs (0.012 GPU) [colorbalancergb]
6.1078 [dev_pixelpipe] took 0.167 secs (0.146 CPU) [export] processed `colorbalancergb' on GPU, blended on GPU
6.1423 [dev_pixelpipe] took 0.034 secs (0.022 CPU) [export] processed `rgblevels' on GPU, blended on GPU
6.1713 [dev_pixelpipe] took 0.029 secs (0.020 CPU) [export] processed `sigmoid' on GPU, blended on GPU
6.3214 [dt_ioppr_transform_image_colorspace] IOP_CS_RGB-->IOP_CS_LAB took 0.059 secs (0.645 CPU) [bilat]
7.6203 [dev_pixelpipe] took 1.449 secs (8.338 CPU) [export] processed `bilat' on CPU, blended on CPU
7.7289 [dev_pixelpipe] took 0.108 secs (0.108 CPU) [export] processed `colorout' on GPU, blended on GPU
7.7331 [resample_cl] took 0.004 secs (0.000 CPU) 1:1 copy/crop of 8065x6046 pixels
7.7505 [dev_pixelpipe] took 0.022 secs (0.017 CPU) [export] processed `finalscale' on GPU, blended on GPU
7.8418 [opencl_profiling] profiling device 0 ('NVIDIA CUDA NVIDIA GeForce GTX 1060 6GB'):
7.8418 [opencl_profiling] spent 0.5348 seconds in [Write Image (from host to device)]
7.8418 [opencl_profiling] spent 0.0026 seconds in rawprepare_1f
7.8418 [opencl_profiling] spent 0.0031 seconds in whitebalance_1f
7.8418 [opencl_profiling] spent 0.0025 seconds in highlights_initmask
7.8418 [opencl_profiling] spent 0.0033 seconds in highlights_dilatemask
7.8418 [opencl_profiling] spent 0.1928 seconds in [Write Buffer (from host to device)]
7.8418 [opencl_profiling] spent 0.0075 seconds in highlights_chroma
7.8418 [opencl_profiling] spent 0.0000 seconds in [Read Buffer (from device to host)]
7.8418 [opencl_profiling] spent 0.0063 seconds in highlights_opposed
7.8418 [opencl_profiling] spent 1.0297 seconds in [Read Image (from device to host)]
7.8418 [opencl_profiling] spent 0.0008 seconds in border_interpolate
7.8418 [opencl_profiling] spent 0.0060 seconds in rcd_border_green
7.8418 [opencl_profiling] spent 0.0107 seconds in rcd_border_redblue
7.8418 [opencl_profiling] spent 0.0074 seconds in rcd_populate
7.8418 [opencl_profiling] spent 0.0052 seconds in rcd_step_1_1
7.8418 [opencl_profiling] spent 0.0040 seconds in rcd_step_1_2
7.8418 [opencl_profiling] spent 0.0025 seconds in rcd_step_2_1
7.8418 [opencl_profiling] spent 0.0065 seconds in rcd_step_3_1
7.8418 [opencl_profiling] spent 0.0037 seconds in rcd_step_4_1
7.8418 [opencl_profiling] spent 0.0020 seconds in rcd_step_4_2
7.8418 [opencl_profiling] spent 0.0058 seconds in rcd_step_5_1
7.8418 [opencl_profiling] spent 0.0093 seconds in rcd_step_5_2
7.8419 [opencl_profiling] spent 0.0099 seconds in rcd_write_output
7.8419 [opencl_profiling] spent 0.0118 seconds in denoiseprofile_precondition_Y0U0V0
7.8419 [opencl_profiling] spent 0.4297 seconds in denoiseprofile_decompose
7.8419 [opencl_profiling] spent 0.0418 seconds in denoiseprofile_reduce_first
7.8419 [opencl_profiling] spent 0.0002 seconds in denoiseprofile_reduce_second
7.8419 [opencl_profiling] spent 0.1217 seconds in denoiseprofile_synthesize
7.8419 [opencl_profiling] spent 0.0659 seconds in [Copy Image (on device)]
7.8419 [opencl_profiling] spent 0.0119 seconds in denoiseprofile_backtransform_Y0U0V0
7.8419 [opencl_profiling] spent 0.0176 seconds in lens_vignette
7.8419 [opencl_profiling] spent 0.0550 seconds in lens_distort_bicubic
7.8419 [opencl_profiling] spent 0.0261 seconds in ashift_bicubic
7.8419 [opencl_profiling] spent 0.0169 seconds in exposure
7.8419 [opencl_profiling] spent 0.0191 seconds in colorin_unbound
7.8419 [opencl_profiling] spent 0.0269 seconds in colorspaces_transform_lab_to_rgb_matrix
7.8419 [opencl_profiling] spent 0.0150 seconds in channelmixerrgb_CAT16
7.8419 [opencl_profiling] spent 0.6065 seconds in eaw_decompose
7.8419 [opencl_profiling] spent 0.1499 seconds in eaw_synthesize
7.8419 [opencl_profiling] spent 0.0180 seconds in colorbalancergb
7.8419 [opencl_profiling] spent 0.0147 seconds in rgblevels
7.8419 [opencl_profiling] spent 0.0215 seconds in sigmoid_loglogistic_per_channel
7.8419 [opencl_profiling] spent 0.0223 seconds in colorout
7.8419 [opencl_profiling] spent 3.5489 seconds totally in command queue (with 0 events missing)
7.8419 [dev_process_export] pixel pipeline processing took 5.563 secs (13.421 CPU)
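One thing that stands out in the profiling section is how much of the command-queue time goes to moving data between host and GPU. A rough Python sketch to total that up is below. It uses a handful of the `[opencl_profiling]` lines copied from the log above. `transfer_share` and `PROFILE_LOG` are hypothetical names of mine, and counting only the "from host to device" / "from device to host" entries as transfers is my own assumption about how to read these lines.

```python
import re

# A few [opencl_profiling] lines copied from the log above.
PROFILE_LOG = """\
7.8418 [opencl_profiling] spent 0.5348 seconds in [Write Image (from host to device)]
7.8418 [opencl_profiling] spent 0.1928 seconds in [Write Buffer (from host to device)]
7.8418 [opencl_profiling] spent 0.0000 seconds in [Read Buffer (from device to host)]
7.8418 [opencl_profiling] spent 1.0297 seconds in [Read Image (from device to host)]
7.8419 [opencl_profiling] spent 0.4297 seconds in denoiseprofile_decompose
7.8419 [opencl_profiling] spent 0.6065 seconds in eaw_decompose
7.8419 [opencl_profiling] spent 3.5489 seconds totally in command queue (with 0 events missing)
"""

SPENT = re.compile(r"spent (?P<secs>[\d.]+) seconds (?P<what>.+)")


def transfer_share(log_text):
    """Return (transfer_seconds, queue_total_seconds, transfer_fraction)."""
    transfer = 0.0
    total = None
    for line in log_text.splitlines():
        m = SPENT.search(line)
        if m is None:
            continue
        secs = float(m["secs"])
        what = m["what"]
        if what.startswith("totally in command queue"):
            total = secs
        elif "from host to device" in what or "from device to host" in what:
            # Host <-> device copies; kernel entries are skipped.
            transfer += secs
    if not total:
        raise ValueError("no command-queue total found in log")
    return transfer, total, transfer / total


transfer, total, share = transfer_share(PROFILE_LOG)
print(f"host<->device transfers: {transfer:.4f} s of {total:.4f} s ({share:.1%})")
```

On my log that is about 1.76 of the 3.55 seconds in the command queue, so roughly half of the profiled GPU time is spent on transfers rather than on the kernels themselves.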
Did you see excessive tiling or other issues in your logs? I had a little tiling with my GPU, in denoiseprofile (denoise (profiled)) and atrous (contrast equalizer). What’s your darktable resource setting? In another benchmark, there was quite a difference (4.4 vs 6 seconds) between large and normal on my machine.
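If you want to check your own log for the tiling and CPU fallbacks mentioned above without reading every line, something like this Python sketch could help. It scans `[dev_pixelpipe]` lines for modules that ran on the CPU or on the GPU with tiling. The sample lines are copied from my log. `slow_path_modules` is a made-up helper name, and the regex only assumes the line format shown in this post, which may differ in other darktable versions.

```python
import re

# A few [dev_pixelpipe] lines copied from the log above.
SAMPLE_LOG = """\
2.4592 [dev_pixelpipe] took 0.133 secs (0.127 CPU) [export] processed `hotpixels' on CPU, blended on CPU
2.5928 [dev_pixelpipe] took 0.134 secs (0.143 CPU) [export] processed `demosaic' on GPU, blended on GPU
3.9984 [dev_pixelpipe] took 1.406 secs (0.866 CPU) [export] processed `denoiseprofile' on GPU with tiling, blended on CPU
4.5705 [dev_pixelpipe] took 0.572 secs (1.564 CPU) [export] processed `lens' on GPU, blended on GPU
5.9407 [dev_pixelpipe] took 1.213 secs (1.770 CPU) [export] processed `atrous' on GPU with tiling, blended on CPU
7.6203 [dev_pixelpipe] took 1.449 secs (8.338 CPU) [export] processed `bilat' on CPU, blended on CPU
"""

MODULE_LINE = re.compile(
    r"took (?P<wall>[\d.]+) secs \((?P<cpu>[\d.]+) CPU\) \[export\] "
    r"processed `(?P<module>\w+)' on (?P<device>GPU|CPU)(?P<tiling> with tiling)?"
)


def slow_path_modules(log_text):
    """Return (module, wall_seconds, reason) for modules run on CPU or tiled on GPU."""
    found = []
    for line in log_text.splitlines():
        m = MODULE_LINE.search(line)
        if m is None:
            continue
        if m["device"] == "CPU":
            reason = "CPU"
        elif m["tiling"]:
            reason = "GPU with tiling"
        else:
            continue  # plain GPU processing, nothing to flag
        found.append((m["module"], float(m["wall"]), reason))
    # Slowest first, so the biggest offenders are at the top.
    return sorted(found, key=lambda item: item[1], reverse=True)


for module, wall, reason in slow_path_modules(SAMPLE_LOG):
    print(f"{module:15s} {wall:6.3f} s  {reason}")
```

In my log the flagged modules (bilat, denoiseprofile, atrous, hotpixels) add up to about 4.2 of the 5.56 seconds for the whole pipeline, so that is where I would look first.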