Just out of curiosity, I ran the same export on my machine (which I wouldn’t call ‘lightning fast’ by today’s standards), using darktable 3.2.1:
OS: Ubuntu 20.04.1 LTS x86_64
Host: Aspire VN7-591G V1.15
Kernel: 5.4.0-52-generic
Uptime: 15 hours, 43 mins
Packages: 3583 (dpkg), 11 (flatpak)
Shell: bash 5.0.17
Resolution: 1920x1080
DE: Plasma
WM: KWin
WM Theme: Materia-Light
Theme: Breeze [Plasma], Breeze [GTK2/3]
Icons: Papirus-Light [Plasma], Papirus-Light [GTK2/3]
Terminal: konsole
Terminal Font: Hack 11
CPU: Intel i7-4720HQ (8) @ 3.600GHz
GPU: Intel 4th Gen Core Processor
GPU: NVIDIA GeForce GTX 960M
Memory: 6020MiB / 15935MiB
CPU-only:
0.575855 [dev] took 0.099 secs (0.112 CPU) to load the image.
0.610138 [export] creating pixelpipe took 0.028 secs (0.166 CPU)
0.611408 [dev_pixelpipe] took 0.000 secs (0.000 CPU) initing base buffer [export]
0.619871 [dev_pixelpipe] took 0.008 secs (0.063 CPU) processed `raw black/white point' on CPU, blended on CPU [export]
0.627932 [dev_pixelpipe] took 0.008 secs (0.041 CPU) processed `white balance' on CPU, blended on CPU [export]
0.633258 [dev_pixelpipe] took 0.005 secs (0.032 CPU) processed `highlight reconstruction' on CPU, blended on CPU [export]
0.771329 [dev_pixelpipe] took 0.138 secs (0.724 CPU) processed `demosaic' on CPU, blended on CPU [export]
4.660575 [dev_pixelpipe] took 3.889 secs (26.570 CPU) processed `denoise (profiled)' on CPU, blended on CPU [export]
4.684545 [dev_pixelpipe] took 0.024 secs (0.024 CPU) processed `lens correction' on CPU, blended on CPU [export]
6.055441 [dev_pixelpipe] took 1.371 secs (9.217 CPU) processed `haze removal' on CPU, blended on CPU [export]
6.187599 [dev_pixelpipe] took 0.132 secs (0.332 CPU) processed `retouch' on CPU, blended on CPU [export]
6.219955 [dev_pixelpipe] took 0.032 secs (0.223 CPU) processed `exposure' on CPU, blended on CPU [export]
6.245422 [dev_pixelpipe] took 0.025 secs (0.037 CPU) processed `mask manager' on CPU, blended on CPU [export]
6.547148 [dev_pixelpipe] took 0.302 secs (2.315 CPU) processed `tone equalizer' on CPU, blended on CPU [export]
6.582799 [dev_pixelpipe] took 0.036 secs (0.266 CPU) processed `input color profile' on CPU, blended on CPU [export]
7.065598 [dev_pixelpipe] took 0.483 secs (3.437 CPU) processed `defringe' on CPU, blended on CPU [export]
10.201755 [dev_pixelpipe] took 3.136 secs (21.815 CPU) processed `contrast equalizer' on CPU, blended on CPU [export]
10.281415 [dev_pixelpipe] took 0.080 secs (0.576 CPU) processed `sharpen' on CPU, blended on CPU [export]
10.400381 [dev_pixelpipe] took 0.119 secs (0.905 CPU) processed `color balance' on CPU, blended on CPU [export]
image colorspace transform Lab-->RGB took 0.039 secs (0.269 CPU) [filmicrgb ]
10.663881 [dev_pixelpipe] took 0.263 secs (1.948 CPU) processed `filmic rgb' on CPU, blended on CPU [export]
image colorspace transform RGB-->Lab took 0.046 secs (0.342 CPU) [bilat ]
11.427388 [dev_pixelpipe] took 0.763 secs (4.642 CPU) processed `local contrast' on CPU, blended on CPU [export]
11.645072 [dev_pixelpipe] took 0.218 secs (1.582 CPU) processed `color zones' on CPU, blended on CPU [export]
12.419915 [dev_pixelpipe] took 0.775 secs (5.607 CPU) processed `output color profile' on CPU, blended on CPU [export]
12.460058 [dev_pixelpipe] took 0.040 secs (0.299 CPU) processed `display encoding' on CPU, blended on CPU [export]
12.460200 [dev_process_export] pixel pipeline processing took 11.850 secs (80.670 CPU)
GPU+CPU:
1.129415 [dev] took 0.095 secs (0.094 CPU) to load the image.
1.162579 [export] creating pixelpipe took 0.027 secs (0.158 CPU)
1.162611 [pixelpipe_process] [export] using device 0
1.163878 [dev_pixelpipe] took 0.000 secs (0.000 CPU) initing base buffer [export]
1.172080 [dev_pixelpipe] took 0.008 secs (0.007 CPU) processed `raw black/white point' on GPU, blended on GPU [export]
1.176384 [dev_pixelpipe] took 0.004 secs (0.003 CPU) processed `white balance' on GPU, blended on GPU [export]
1.182154 [dev_pixelpipe] took 0.006 secs (0.005 CPU) processed `highlight reconstruction' on GPU, blended on GPU [export]
1.218059 [dev_pixelpipe] took 0.036 secs (0.016 CPU) processed `demosaic' on GPU, blended on GPU [export]
1.780975 [dev_pixelpipe] took 0.563 secs (0.412 CPU) processed `denoise (profiled)' on GPU, blended on GPU [export]
1.801260 [dev_pixelpipe] took 0.020 secs (0.007 CPU) processed `lens correction' on GPU, blended on GPU [export]
3.097789 [dev_pixelpipe] took 1.297 secs (1.886 CPU) processed `haze removal' on GPU, blended on GPU [export]
3.170938 [dev_pixelpipe] took 0.073 secs (0.282 CPU) processed `retouch' on GPU, blended on GPU [export]
3.191366 [dev_pixelpipe] took 0.020 secs (0.004 CPU) processed `exposure' on GPU, blended on GPU [export]
3.211700 [dev_pixelpipe] took 0.020 secs (0.003 CPU) processed `mask manager' on GPU, blended on GPU [export]
3.613545 [dev_pixelpipe] took 0.402 secs (2.268 CPU) processed `tone equalizer' on CPU, blended on CPU [export]
3.658883 [dev_pixelpipe] took 0.045 secs (0.044 CPU) processed `input color profile' on GPU, blended on GPU [export]
4.166169 [dev_pixelpipe] took 0.507 secs (3.306 CPU) processed `defringe' on CPU, blended on CPU [export]
4.853289 [dev_pixelpipe] took 0.687 secs (0.685 CPU) processed `contrast equalizer' on GPU, blended on GPU [export]
4.907485 [dev_pixelpipe] took 0.054 secs (0.029 CPU) processed `sharpen' on GPU, blended on GPU [export]
4.928029 [dev_pixelpipe] took 0.021 secs (0.007 CPU) processed `color balance' on GPU, blended on GPU [export]
image colorspace transform Lab-->RGB took 0.032 secs (0.248 CPU) [filmicrgb ]
5.212033 [dev_pixelpipe] took 0.284 secs (1.893 CPU) processed `filmic rgb' on CPU, blended on CPU [export]
image colorspace transform RGB-->Lab took 0.008 secs (0.004 GPU) [bilat ]
5.552033 [dev_pixelpipe] took 0.340 secs (0.245 CPU) processed `local contrast' on GPU, blended on GPU [export]
5.579511 [dev_pixelpipe] took 0.027 secs (0.014 CPU) processed `color zones' on GPU, blended on GPU [export]
6.333490 [dev_pixelpipe] took 0.754 secs (5.514 CPU) processed `output color profile' on CPU, blended on CPU [export]
6.376401 [dev_pixelpipe] took 0.043 secs (0.316 CPU) processed `display encoding' on CPU, blended on CPU [export]
6.376648 [opencl_profiling] profiling device 0 ('GeForce GTX 960M'):
6.376710 [opencl_profiling] spent 0.1123 seconds in [Write Image (from host to device)]
6.376763 [opencl_profiling] spent 0.0019 seconds in rawprepare_1f
6.376812 [opencl_profiling] spent 0.0019 seconds in whitebalance_1f
6.376883 [opencl_profiling] spent 0.0019 seconds in highlights_1f_clip
6.376932 [opencl_profiling] spent 0.0074 seconds in ppg_demosaic_green
6.376978 [opencl_profiling] spent 0.0084 seconds in ppg_demosaic_redblue
6.377025 [opencl_profiling] spent 0.0011 seconds in border_interpolate
6.377072 [opencl_profiling] spent 0.0074 seconds in denoiseprofile_precondition_Y0U0V0
6.377118 [opencl_profiling] spent 0.2821 seconds in denoiseprofile_decompose
6.377163 [opencl_profiling] spent 0.0391 seconds in denoiseprofile_reduce_first
6.377209 [opencl_profiling] spent 0.0002 seconds in denoiseprofile_reduce_second
6.377259 [opencl_profiling] spent 0.0003 seconds in [Read Buffer (from device to host)]
6.377306 [opencl_profiling] spent 0.0672 seconds in denoiseprofile_synthesize
6.377351 [opencl_profiling] spent 0.0421 seconds in [Copy Image (on device)]
6.377397 [opencl_profiling] spent 0.0071 seconds in denoiseprofile_backtransform_Y0U0V0
6.377443 [opencl_profiling] spent 0.0010 seconds in blendop_set_mask
6.377489 [opencl_profiling] spent 0.0104 seconds in blendop_rgb
6.377533 [opencl_profiling] spent 0.2773 seconds in [Read Image (from device to host)]
6.377580 [opencl_profiling] spent 0.0038 seconds in hazeremoval_transision_map
6.377626 [opencl_profiling] spent 0.0693 seconds in hazeremoval_box_max_x
6.377674 [opencl_profiling] spent 0.0050 seconds in hazeremoval_box_max_y
6.377719 [opencl_profiling] spent 0.0957 seconds in hazeremoval_box_min_x
6.377765 [opencl_profiling] spent 0.0061 seconds in hazeremoval_box_min_y
6.377810 [opencl_profiling] spent 0.0057 seconds in guided_filter_split_rgb_image
6.377854 [opencl_profiling] spent 0.6577 seconds in guided_filter_box_mean_x
6.377899 [opencl_profiling] spent 0.0340 seconds in guided_filter_box_mean_y
6.377944 [opencl_profiling] spent 0.0063 seconds in guided_filter_covariances
6.377989 [opencl_profiling] spent 0.0080 seconds in guided_filter_variances
6.378033 [opencl_profiling] spent 0.0278 seconds in guided_filter_update_covariance
6.378081 [opencl_profiling] spent 0.0142 seconds in guided_filter_solve
6.378126 [opencl_profiling] spent 0.0070 seconds in guided_filter_generate_result
6.378132 [opencl_profiling] spent 0.0069 seconds in hazeremoval_dehaze
6.378143 [opencl_profiling] spent 0.0067 seconds in [Copy Image to Buffer (on device)]
6.378148 [opencl_profiling] spent 0.0002 seconds in [Write Buffer (from host to device)]
6.378151 [opencl_profiling] spent 0.0004 seconds in retouch_copy_buffer_to_buffer
6.378155 [opencl_profiling] spent 0.0002 seconds in retouch_copy_buffer_to_buffer_masked
6.378159 [opencl_profiling] spent 0.0062 seconds in retouch_copy_buffer_to_image
6.378162 [opencl_profiling] spent 0.0061 seconds in exposure
6.378165 [opencl_profiling] spent 0.0072 seconds in colorin_unbound
6.378168 [opencl_profiling] spent 0.3216 seconds in eaw_decompose
6.378170 [opencl_profiling] spent 0.0755 seconds in eaw_synthesize
6.378173 [opencl_profiling] spent 0.0019 seconds in blendop_mask_Lab
6.378176 [opencl_profiling] spent 0.0104 seconds in blendop_Lab
6.378179 [opencl_profiling] spent 0.0085 seconds in sharpen_hblur
6.378182 [opencl_profiling] spent 0.0073 seconds in sharpen_vblur
6.378184 [opencl_profiling] spent 0.0098 seconds in sharpen_mix
6.378187 [opencl_profiling] spent 0.0062 seconds in colorbalance_cdl
6.378190 [opencl_profiling] spent 0.0062 seconds in colorspaces_transform_rgb_matrix_to_lab
6.378193 [opencl_profiling] spent 0.0061 seconds in pad_input
6.378195 [opencl_profiling] spent 0.0738 seconds in gauss_reduce
6.378198 [opencl_profiling] spent 0.0399 seconds in process_curve
6.378201 [opencl_profiling] spent 0.0516 seconds in laplacian_assemble
6.378204 [opencl_profiling] spent 0.0069 seconds in write_back
6.378206 [opencl_profiling] spent 0.0127 seconds in colorzones_v3
6.378209 [opencl_profiling] spent 2.4817 seconds totally in command queue (with 0 events missing)
6.378229 [dev_process_export] pixel pipeline processing took 5.216 secs (16.973 CPU)
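If you want to see how much of the GPU time goes into memory copies rather than actual kernels, the `[opencl_profiling]` lines can be summed up with a few lines of Python. This is only a rough sketch: the regular expressions are my own guess at the line format based on the output above, and I'm assuming that the bracketed entries (`[Write Image ...]`, `[Read Image ...]`, etc.) are the memory operations. Only a subset of the kernel lines is pasted into the sample to keep it short.

```python
import re

# Assumed line formats, inferred from the log output above.
ENTRY_RE = re.compile(r"\[opencl_profiling\] spent\s+(\d+\.\d+) seconds in (.+)")
TOTAL_RE = re.compile(r"spent\s+(\d+\.\d+) seconds totally in command queue")

# All six bracketed (memory operation) lines from the log above,
# plus the three most expensive kernels and the summary line.
SAMPLE = """\
6.376710 [opencl_profiling] spent 0.1123 seconds in [Write Image (from host to device)]
6.377118 [opencl_profiling] spent 0.2821 seconds in denoiseprofile_decompose
6.377259 [opencl_profiling] spent 0.0003 seconds in [Read Buffer (from device to host)]
6.377351 [opencl_profiling] spent 0.0421 seconds in [Copy Image (on device)]
6.377533 [opencl_profiling] spent 0.2773 seconds in [Read Image (from device to host)]
6.377854 [opencl_profiling] spent 0.6577 seconds in guided_filter_box_mean_x
6.378143 [opencl_profiling] spent 0.0067 seconds in [Copy Image to Buffer (on device)]
6.378148 [opencl_profiling] spent 0.0002 seconds in [Write Buffer (from host to device)]
6.378168 [opencl_profiling] spent 0.3216 seconds in eaw_decompose
6.378209 [opencl_profiling] spent 2.4817 seconds totally in command queue (with 0 events missing)
"""


def parse_profile(log):
    """Return a list of (name, seconds) for every per-entry profiling line."""
    entries = []
    for line in log.splitlines():
        match = ENTRY_RE.search(line)
        if match:
            entries.append((match.group(2).strip(), float(match.group(1))))
    return entries


entries = parse_profile(SAMPLE)

# Assumption: bracketed names are memory operations, the rest are kernels.
transfers = [(name, secs) for name, secs in entries if name.startswith("[")]
kernels = [(name, secs) for name, secs in entries if not name.startswith("[")]

transfer_secs = sum(secs for _, secs in transfers)
queue_total = float(TOTAL_RE.search(SAMPLE).group(1))
transfer_share = transfer_secs / queue_total

slowest_kernel = max(kernels, key=lambda item: item[1])

print(f"memory operations: {transfer_secs:.4f} s ({transfer_share:.0%} of the command queue)")
print(f"slowest kernel in the sample: {slowest_kernel[0]} ({slowest_kernel[1]:.4f} s)")
```

Feeding it the complete log instead of `SAMPLE` should work the same way, as long as the line format matches.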
Some interesting points:
- Your i7-6500U is more or less on par with my i7-4720HQ in single-thread benchmarks, but here it takes twice as long to process the image. I guess the difference is 4 threads vs. 8 threads running in parallel.
- Despite being quite an old GPU, the 960M trounces the integrated Intel GPU. When people here point to the OpenCL speed-up in darktable, they are specifically talking about discrete GPUs. Even if the Intel NEO driver can give you more speed than the CPU alone, it is no match for an AMD or NVIDIA card.
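To put numbers on both points, here is a small Python sketch that compares the two runs module by module. The regular expressions are my own assumption about the `[dev_pixelpipe]` line format, taken from the logs above, and only a few lines from each run are pasted in as sample data. The ratio of CPU seconds to wall-clock seconds in the summary line is just a rough indication of how many threads were busy on average.

```python
import re

# Assumed line formats, inferred from the logs above.
MODULE_RE = re.compile(
    r"\[dev_pixelpipe\] took (\d+\.\d+) secs \((\d+\.\d+) CPU\) "
    r"processed `([^']+)' on (CPU|GPU)"
)
TOTAL_RE = re.compile(
    r"\[dev_process_export\] pixel pipeline processing took "
    r"(\d+\.\d+) secs \((\d+\.\d+) CPU\)"
)

# A few lines copied from the CPU-only run.
CPU_LOG = """\
4.660575 [dev_pixelpipe] took 3.889 secs (26.570 CPU) processed `denoise (profiled)' on CPU, blended on CPU [export]
6.055441 [dev_pixelpipe] took 1.371 secs (9.217 CPU) processed `haze removal' on CPU, blended on CPU [export]
6.547148 [dev_pixelpipe] took 0.302 secs (2.315 CPU) processed `tone equalizer' on CPU, blended on CPU [export]
10.201755 [dev_pixelpipe] took 3.136 secs (21.815 CPU) processed `contrast equalizer' on CPU, blended on CPU [export]
12.460200 [dev_process_export] pixel pipeline processing took 11.850 secs (80.670 CPU)
"""

# The same modules from the GPU+CPU run.
GPU_LOG = """\
1.780975 [dev_pixelpipe] took 0.563 secs (0.412 CPU) processed `denoise (profiled)' on GPU, blended on GPU [export]
3.097789 [dev_pixelpipe] took 1.297 secs (1.886 CPU) processed `haze removal' on GPU, blended on GPU [export]
3.613545 [dev_pixelpipe] took 0.402 secs (2.268 CPU) processed `tone equalizer' on CPU, blended on CPU [export]
4.853289 [dev_pixelpipe] took 0.687 secs (0.685 CPU) processed `contrast equalizer' on GPU, blended on GPU [export]
6.378229 [dev_process_export] pixel pipeline processing took 5.216 secs (16.973 CPU)
"""


def parse_modules(log):
    """Map module name -> (wall-clock seconds, device) for each pixelpipe line."""
    return {
        match.group(3): (float(match.group(1)), match.group(4))
        for match in MODULE_RE.finditer(log)
    }


def parse_total(log):
    """Return (wall-clock seconds, CPU seconds) from the pipeline summary line."""
    match = TOTAL_RE.search(log)
    return float(match.group(1)), float(match.group(2))


cpu_modules = parse_modules(CPU_LOG)
gpu_modules = parse_modules(GPU_LOG)

# Speed-up per module: CPU-only wall time divided by GPU+CPU wall time.
speedup = {
    name: cpu_modules[name][0] / gpu_modules[name][0]
    for name in cpu_modules
    if name in gpu_modules
}

for name, factor in sorted(speedup.items(), key=lambda item: -item[1]):
    device = gpu_modules[name][1]
    print(f"{name:20s} {factor:5.2f}x (ran on {device} in the second run)")

cpu_wall, cpu_secs = parse_total(CPU_LOG)
gpu_wall, _ = parse_total(GPU_LOG)
overall_speedup = cpu_wall / gpu_wall
busy_threads = cpu_secs / cpu_wall

print(f"whole pipeline: {overall_speedup:.2f}x faster with OpenCL")
print(f"CPU-only run kept about {busy_threads:.1f} threads busy on average")
```

Modules that still report `on CPU` in the second run (tone equalizer here, and also defringe, filmic rgb, output color profile and display encoding in the full log) get no benefit from the GPU, which is why the overall gain is smaller than the gain on the heaviest modules.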