Performance issues

@Tore_Valberg
Hey there!

I had more or less settled on trying CentOS because of Resolve.

Yes, I follow your line of reasoning: DaVinci even offers a downloadable CentOS ISO with Resolve built in. But as far as I know, it does not like Nvidia GFX at all :frowning:

There are many interesting distros, a few are even tailor-made for Ryzen CPUs.
On Phoronix you can read more about them, as well as study speed graphs.
Clear Linux surprisingly gets good marks [surprisingly because of who is behind it :-)]. But it does not like Nvidia either.

If you plan to dual boot with Windows, Manjaro’s installer is a good bet…

Have fun!
Claes in Lund, Sweden

Just out of curiosity, I ran the same export on my machine (which I wouldn’t call ‘lightning fast’ by today’s standards), using darktable 3.2.1:

OS: Ubuntu 20.04.1 LTS x86_64
Host: Aspire VN7-591G V1.15 
Kernel: 5.4.0-52-generic 
Uptime: 15 hours, 43 mins 
Packages: 3583 (dpkg), 11 (flatpak) 
Shell: bash 5.0.17 
Resolution: 1920x1080 
DE: Plasma 
WM: KWin 
WM Theme: Materia-Light 
Theme: Breeze [Plasma], Breeze [GTK2/3] 
Icons: Papirus-Light [Plasma], Papirus-Light [GTK2/3] 
Terminal: konsole 
Terminal Font: Hack 11 
CPU: Intel i7-4720HQ (8) @ 3.600GHz 
GPU: Intel 4th Gen Core Processor 
GPU: NVIDIA GeForce GTX 960M 
Memory: 6020MiB / 15935MiB

CPU-only:

0.575855 [dev] took 0.099 secs (0.112 CPU) to load the image.
0.610138 [export] creating pixelpipe took 0.028 secs (0.166 CPU)
0.611408 [dev_pixelpipe] took 0.000 secs (0.000 CPU) initing base buffer [export]
0.619871 [dev_pixelpipe] took 0.008 secs (0.063 CPU) processed `raw black/white point' on CPU, blended on CPU [export]
0.627932 [dev_pixelpipe] took 0.008 secs (0.041 CPU) processed `white balance' on CPU, blended on CPU [export]
0.633258 [dev_pixelpipe] took 0.005 secs (0.032 CPU) processed `highlight reconstruction' on CPU, blended on CPU [export]
0.771329 [dev_pixelpipe] took 0.138 secs (0.724 CPU) processed `demosaic' on CPU, blended on CPU [export]
4.660575 [dev_pixelpipe] took 3.889 secs (26.570 CPU) processed `denoise (profiled)' on CPU, blended on CPU [export]
4.684545 [dev_pixelpipe] took 0.024 secs (0.024 CPU) processed `lens correction' on CPU, blended on CPU [export]
6.055441 [dev_pixelpipe] took 1.371 secs (9.217 CPU) processed `haze removal' on CPU, blended on CPU [export]
6.187599 [dev_pixelpipe] took 0.132 secs (0.332 CPU) processed `retouch' on CPU, blended on CPU [export]
6.219955 [dev_pixelpipe] took 0.032 secs (0.223 CPU) processed `exposure' on CPU, blended on CPU [export]
6.245422 [dev_pixelpipe] took 0.025 secs (0.037 CPU) processed `mask manager' on CPU, blended on CPU [export]
6.547148 [dev_pixelpipe] took 0.302 secs (2.315 CPU) processed `tone equalizer' on CPU, blended on CPU [export]
6.582799 [dev_pixelpipe] took 0.036 secs (0.266 CPU) processed `input color profile' on CPU, blended on CPU [export]
7.065598 [dev_pixelpipe] took 0.483 secs (3.437 CPU) processed `defringe' on CPU, blended on CPU [export]
10.201755 [dev_pixelpipe] took 3.136 secs (21.815 CPU) processed `contrast equalizer' on CPU, blended on CPU [export]
10.281415 [dev_pixelpipe] took 0.080 secs (0.576 CPU) processed `sharpen' on CPU, blended on CPU [export]
10.400381 [dev_pixelpipe] took 0.119 secs (0.905 CPU) processed `color balance' on CPU, blended on CPU [export]
image colorspace transform Lab-->RGB took 0.039 secs (0.269 CPU) [filmicrgb ]
10.663881 [dev_pixelpipe] took 0.263 secs (1.948 CPU) processed `filmic rgb' on CPU, blended on CPU [export]
image colorspace transform RGB-->Lab took 0.046 secs (0.342 CPU) [bilat ]
11.427388 [dev_pixelpipe] took 0.763 secs (4.642 CPU) processed `local contrast' on CPU, blended on CPU [export]
11.645072 [dev_pixelpipe] took 0.218 secs (1.582 CPU) processed `color zones' on CPU, blended on CPU [export]
12.419915 [dev_pixelpipe] took 0.775 secs (5.607 CPU) processed `output color profile' on CPU, blended on CPU [export]
12.460058 [dev_pixelpipe] took 0.040 secs (0.299 CPU) processed `display encoding' on CPU, blended on CPU [export]
12.460200 [dev_process_export] pixel pipeline processing took 11.850 secs (80.670 CPU)

GPU+CPU:

1.129415 [dev] took 0.095 secs (0.094 CPU) to load the image.
1.162579 [export] creating pixelpipe took 0.027 secs (0.158 CPU)
1.162611 [pixelpipe_process] [export] using device 0
1.163878 [dev_pixelpipe] took 0.000 secs (0.000 CPU) initing base buffer [export]
1.172080 [dev_pixelpipe] took 0.008 secs (0.007 CPU) processed `raw black/white point' on GPU, blended on GPU [export]
1.176384 [dev_pixelpipe] took 0.004 secs (0.003 CPU) processed `white balance' on GPU, blended on GPU [export]
1.182154 [dev_pixelpipe] took 0.006 secs (0.005 CPU) processed `highlight reconstruction' on GPU, blended on GPU [export]
1.218059 [dev_pixelpipe] took 0.036 secs (0.016 CPU) processed `demosaic' on GPU, blended on GPU [export]
1.780975 [dev_pixelpipe] took 0.563 secs (0.412 CPU) processed `denoise (profiled)' on GPU, blended on GPU [export]
1.801260 [dev_pixelpipe] took 0.020 secs (0.007 CPU) processed `lens correction' on GPU, blended on GPU [export]
3.097789 [dev_pixelpipe] took 1.297 secs (1.886 CPU) processed `haze removal' on GPU, blended on GPU [export]
3.170938 [dev_pixelpipe] took 0.073 secs (0.282 CPU) processed `retouch' on GPU, blended on GPU [export]
3.191366 [dev_pixelpipe] took 0.020 secs (0.004 CPU) processed `exposure' on GPU, blended on GPU [export]
3.211700 [dev_pixelpipe] took 0.020 secs (0.003 CPU) processed `mask manager' on GPU, blended on GPU [export]
3.613545 [dev_pixelpipe] took 0.402 secs (2.268 CPU) processed `tone equalizer' on CPU, blended on CPU [export]
3.658883 [dev_pixelpipe] took 0.045 secs (0.044 CPU) processed `input color profile' on GPU, blended on GPU [export]
4.166169 [dev_pixelpipe] took 0.507 secs (3.306 CPU) processed `defringe' on CPU, blended on CPU [export]
4.853289 [dev_pixelpipe] took 0.687 secs (0.685 CPU) processed `contrast equalizer' on GPU, blended on GPU [export]
4.907485 [dev_pixelpipe] took 0.054 secs (0.029 CPU) processed `sharpen' on GPU, blended on GPU [export]
4.928029 [dev_pixelpipe] took 0.021 secs (0.007 CPU) processed `color balance' on GPU, blended on GPU [export]
image colorspace transform Lab-->RGB took 0.032 secs (0.248 CPU) [filmicrgb ]
5.212033 [dev_pixelpipe] took 0.284 secs (1.893 CPU) processed `filmic rgb' on CPU, blended on CPU [export]
image colorspace transform RGB-->Lab took 0.008 secs (0.004 GPU) [bilat ]
5.552033 [dev_pixelpipe] took 0.340 secs (0.245 CPU) processed `local contrast' on GPU, blended on GPU [export]
5.579511 [dev_pixelpipe] took 0.027 secs (0.014 CPU) processed `color zones' on GPU, blended on GPU [export]
6.333490 [dev_pixelpipe] took 0.754 secs (5.514 CPU) processed `output color profile' on CPU, blended on CPU [export]
6.376401 [dev_pixelpipe] took 0.043 secs (0.316 CPU) processed `display encoding' on CPU, blended on CPU [export]
6.376648 [opencl_profiling] profiling device 0 ('GeForce GTX 960M'):
6.376710 [opencl_profiling] spent  0.1123 seconds in [Write Image (from host to device)]
6.376763 [opencl_profiling] spent  0.0019 seconds in rawprepare_1f
6.376812 [opencl_profiling] spent  0.0019 seconds in whitebalance_1f
6.376883 [opencl_profiling] spent  0.0019 seconds in highlights_1f_clip
6.376932 [opencl_profiling] spent  0.0074 seconds in ppg_demosaic_green
6.376978 [opencl_profiling] spent  0.0084 seconds in ppg_demosaic_redblue
6.377025 [opencl_profiling] spent  0.0011 seconds in border_interpolate
6.377072 [opencl_profiling] spent  0.0074 seconds in denoiseprofile_precondition_Y0U0V0
6.377118 [opencl_profiling] spent  0.2821 seconds in denoiseprofile_decompose
6.377163 [opencl_profiling] spent  0.0391 seconds in denoiseprofile_reduce_first
6.377209 [opencl_profiling] spent  0.0002 seconds in denoiseprofile_reduce_second
6.377259 [opencl_profiling] spent  0.0003 seconds in [Read Buffer (from device to host)]
6.377306 [opencl_profiling] spent  0.0672 seconds in denoiseprofile_synthesize
6.377351 [opencl_profiling] spent  0.0421 seconds in [Copy Image (on device)]
6.377397 [opencl_profiling] spent  0.0071 seconds in denoiseprofile_backtransform_Y0U0V0
6.377443 [opencl_profiling] spent  0.0010 seconds in blendop_set_mask
6.377489 [opencl_profiling] spent  0.0104 seconds in blendop_rgb
6.377533 [opencl_profiling] spent  0.2773 seconds in [Read Image (from device to host)]
6.377580 [opencl_profiling] spent  0.0038 seconds in hazeremoval_transision_map
6.377626 [opencl_profiling] spent  0.0693 seconds in hazeremoval_box_max_x
6.377674 [opencl_profiling] spent  0.0050 seconds in hazeremoval_box_max_y
6.377719 [opencl_profiling] spent  0.0957 seconds in hazeremoval_box_min_x
6.377765 [opencl_profiling] spent  0.0061 seconds in hazeremoval_box_min_y
6.377810 [opencl_profiling] spent  0.0057 seconds in guided_filter_split_rgb_image
6.377854 [opencl_profiling] spent  0.6577 seconds in guided_filter_box_mean_x
6.377899 [opencl_profiling] spent  0.0340 seconds in guided_filter_box_mean_y
6.377944 [opencl_profiling] spent  0.0063 seconds in guided_filter_covariances
6.377989 [opencl_profiling] spent  0.0080 seconds in guided_filter_variances
6.378033 [opencl_profiling] spent  0.0278 seconds in guided_filter_update_covariance
6.378081 [opencl_profiling] spent  0.0142 seconds in guided_filter_solve
6.378126 [opencl_profiling] spent  0.0070 seconds in guided_filter_generate_result
6.378132 [opencl_profiling] spent  0.0069 seconds in hazeremoval_dehaze
6.378143 [opencl_profiling] spent  0.0067 seconds in [Copy Image to Buffer (on device)]
6.378148 [opencl_profiling] spent  0.0002 seconds in [Write Buffer (from host to device)]
6.378151 [opencl_profiling] spent  0.0004 seconds in retouch_copy_buffer_to_buffer
6.378155 [opencl_profiling] spent  0.0002 seconds in retouch_copy_buffer_to_buffer_masked
6.378159 [opencl_profiling] spent  0.0062 seconds in retouch_copy_buffer_to_image
6.378162 [opencl_profiling] spent  0.0061 seconds in exposure
6.378165 [opencl_profiling] spent  0.0072 seconds in colorin_unbound
6.378168 [opencl_profiling] spent  0.3216 seconds in eaw_decompose
6.378170 [opencl_profiling] spent  0.0755 seconds in eaw_synthesize
6.378173 [opencl_profiling] spent  0.0019 seconds in blendop_mask_Lab
6.378176 [opencl_profiling] spent  0.0104 seconds in blendop_Lab
6.378179 [opencl_profiling] spent  0.0085 seconds in sharpen_hblur
6.378182 [opencl_profiling] spent  0.0073 seconds in sharpen_vblur
6.378184 [opencl_profiling] spent  0.0098 seconds in sharpen_mix
6.378187 [opencl_profiling] spent  0.0062 seconds in colorbalance_cdl
6.378190 [opencl_profiling] spent  0.0062 seconds in colorspaces_transform_rgb_matrix_to_lab
6.378193 [opencl_profiling] spent  0.0061 seconds in pad_input
6.378195 [opencl_profiling] spent  0.0738 seconds in gauss_reduce
6.378198 [opencl_profiling] spent  0.0399 seconds in process_curve
6.378201 [opencl_profiling] spent  0.0516 seconds in laplacian_assemble
6.378204 [opencl_profiling] spent  0.0069 seconds in write_back
6.378206 [opencl_profiling] spent  0.0127 seconds in colorzones_v3
6.378209 [opencl_profiling] spent  2.4817 seconds totally in command queue (with 0 events missing)
6.378229 [dev_process_export] pixel pipeline processing took 5.216 secs (16.973 CPU)

Some interesting points:

  • Your i7-6500U is more or less a match for my i7-4720HQ in single-thread benchmarks, but here it takes twice as long to process the image. I guess the difference is 4 threads vs. 8 threads running in parallel.
  • Despite being quite an old GPU, the 960M trounces the integrated Intel GPU. When people here point to the OpenCL speed-up in darktable, they specifically mean discrete GPUs. Even if the Intel NEO driver can give you more speed than the CPU alone, it is no match for an AMD or NVidia card.
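For anyone who wants to compare their own `-d perf` output the same way, here is a rough Python sketch. It assumes the log lines look like the ones pasted above; the function names (`parse_perf_log`, `slowest`) are made up for this example and are not part of darktable. The ratio of CPU seconds to wall-clock seconds gives a crude idea of how many threads a module kept busy.

```python
import re

# Matches lines such as:
# 4.660575 [dev_pixelpipe] took 3.889 secs (26.570 CPU) processed `denoise (profiled)' on CPU, ...
# Decimal commas (as printed on some locales) are accepted as well.
LINE_RE = re.compile(
    r"\[dev_pixelpipe\] took (?P<wall>[\d.,]+) secs \((?P<cpu>[\d.,]+) CPU\) "
    r"processed [`'‘’]?(?P<module>.+?)[`'‘’]? on (?P<device>CPU|GPU)"
)


def to_seconds(text):
    """Convert '3.889' or '3,889' to a float number of seconds."""
    return float(text.replace(",", "."))


def parse_perf_log(log_text):
    """Return a list of (module, device, wall_seconds, cpu_seconds) tuples."""
    rows = []
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            rows.append((
                match["module"],
                match["device"],
                to_seconds(match["wall"]),
                to_seconds(match["cpu"]),
            ))
    return rows


def slowest(rows, n=3):
    """Return the n rows with the largest wall-clock time."""
    return sorted(rows, key=lambda row: row[2], reverse=True)[:n]


# A few lines copied from the CPU-only run above.
SAMPLE_LOG = """\
0.771329 [dev_pixelpipe] took 0.138 secs (0.724 CPU) processed `demosaic' on CPU, blended on CPU [export]
4.660575 [dev_pixelpipe] took 3.889 secs (26.570 CPU) processed `denoise (profiled)' on CPU, blended on CPU [export]
6.055441 [dev_pixelpipe] took 1.371 secs (9.217 CPU) processed `haze removal' on CPU, blended on CPU [export]
10.201755 [dev_pixelpipe] took 3.136 secs (21.815 CPU) processed `contrast equalizer' on CPU, blended on CPU [export]
"""

if __name__ == "__main__":
    for module, device, wall, cpu in slowest(parse_perf_log(SAMPLE_LOG)):
        busy = cpu / wall if wall > 0 else 0.0
        print(f"{module:<22} {device}  {wall:7.3f} s wall  ~{busy:.1f} threads busy")
```

On the CPU-only run, the heavy modules show a CPU/wall ratio of close to 7, which fits the guess that the 8 hardware threads are doing most of the work there.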

Thanks. I went for Manjaro, and it went a lot smoother than I expected. CUDA was installed with the proprietary Nvidia driver; I just had a little fiddling to get OpenCL working. I had to install some small packages separately.

Now the easiest part was getting DaVinci working. I used the AUR to install DR 16.2.7. I just let the script do its thing: downloading, compiling, installing and sorting out dependencies. I believe it used a deb package as source. Zero manual fiddling; it worked out of the box with CUDA. And I can render and encode/export both H.264 and H.265. I don’t have MP3, but that isn’t an issue at all.

I gotta say Linux has come far since the last time I used it as a desktop environment. And Manjaro seems snappy as h… And I love the Xkde interface. Now it just remains to be seen whether it is stable. But this went beyond my expectations.


I went for Manjaro; as you can see in my other reply, it went very smoothly except for getting OpenCL working. Even my NTFS drives got mounted, and I have r/w access to them with no fuss. The only thing I’m really missing now is a good VPN. My NordVPN sub is expiring anyway, so I could look for an alternative that is Linux-friendly, perhaps from someone that offers OpenVPN support.

What?! I don’t believe you, please post a screenshot from the render pane :open_mouth:

edit: I guess it’s only supported for Nvidia GPUs:

:sob: :sob: :sob:


Wouldn’t it be better to move the DaVinci part from this darktable issue to a new issue?


Yes, move them to the Resolve forum where paid support people can support their non-FOSS software :wink:


I can PM the screenshot, def have H.264/H.265

But back to Darktable indeed :slight_smile:
Just gave it a test, and it seemed perfectly fine. But then it hung on me, and it didn’t appear to write the last 5-6 history steps to the XMP. I had lost quite a bit of history when I got back in, as well as some modules in my favourites tab :frowning:

I’ll start it with -d all and see now.

I have used ProtonVPN for a few years on various Linux distros (currently Arch and Fedora) and iOS. It has always been fast and reliable for me. I use it with the OpenVPN client.


First, I think the user manual should explain how to debug darktable. On Windows you have to do the following:

start C:\"program files"\darktable\bin\darktable.exe -d perf -d opencl > log.txt 2>&1

The output is written to the hidden(!) file:

C:\Users\[username]\AppData\Local\Microsoft\Windows\INetCache\darktable\darktable-log.txt

This is not something you just know!
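Since that folder is hidden in Explorer, a small Python sketch like the one below can print the full path so it can be pasted into an editor. It assumes that the `LOCALAPPDATA` environment variable points to `C:\Users\[username]\AppData\Local`, which is the usual Windows default; the function name `darktable_log_path` is made up for this example.

```python
import os
from pathlib import PureWindowsPath


def darktable_log_path(local_appdata):
    """Build the log location quoted above from the per-user AppData\\Local folder."""
    return (
        PureWindowsPath(local_appdata)
        / "Microsoft" / "Windows" / "INetCache"
        / "darktable" / "darktable-log.txt"
    )


if __name__ == "__main__":
    # Fall back to a placeholder path if the variable is not set (e.g. not on Windows).
    base = os.environ.get("LOCALAPPDATA", r"C:\Users\username\AppData\Local")
    print(darktable_log_path(base))
```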

Now on to the performance issue.

Exporting the example from @KristijanZic took 11,418 secs (15,422 CPU). This is fast and OK in my opinion, but the image and edits do not seem typical.

The image below took 85,214 secs (223,359 CPU) even though the edit consists of only a few steps, including tone equalizer and denoise. I think this is slow.

I have uploaded the raw file, XMP and log.


DSC_2575.NEF (29.9 MB) DSC_2575.NEF.xmp (7.2 KB)
darktable-log.3.txt (4.3 KB)


DT is perfectly usable on Windows with OpenCL and everything, at least with nVidia cards. I use it every day, with no major loss of performance compared to Linux (which I use for development because it compiles faster).
Also, fantastic work has been done on the GUI in 3.2, which is now fully themable and OS- and DPI-independent, much better than much other multiplatform software.

Try avoiding tiling (see the links in one of my previous posts). Also, check why contrast equalizer (‘atrous’) was not processed using OpenCL.

60,572380 [opencl_atrous] couldn't enqueue kernel! -4
60,603627 [default_process_tiling_opencl_ptp] couldn't run process_cl() for module 'atrous' in tiling mode: 0
60,603627 [opencl_pixelpipe] could not run module 'atrous' on gpu. falling back to cpu path
97,638092 [dev_pixelpipe] took 39,362 secs (138,781 CPU) processed `contrast equalizer' on CPU with tiling, blended on CPU [export]
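To spot this kind of fallback quickly in a long log, something like the Python sketch below could help. It assumes the message wording shown above, and that the number after "couldn't enqueue kernel!" is the raw OpenCL status code. If so, -4 is `CL_MEM_OBJECT_ALLOCATION_FAILURE` in the OpenCL headers, meaning the device could not allocate memory for a buffer, which fits the advice about tiling and memory settings. The function name `find_gpu_fallbacks` is made up for this example.

```python
import re

ENQUEUE_RE = re.compile(
    r"\[opencl_(?P<module>\w+)\] couldn't enqueue kernel! (?P<code>-?\d+)"
)
FALLBACK_RE = re.compile(r"could not run module '(?P<module>[^']+)' on gpu")

# A small subset of the status codes defined in the OpenCL headers (cl.h).
OPENCL_ERRORS = {
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}


def find_gpu_fallbacks(log_text):
    """Return (module, error_name) for every module that fell back to the CPU.

    error_name is None when no enqueue error was logged for that module,
    or 'unknown (<code>)' when the code is not in the small table above.
    """
    last_error = {}
    fallbacks = []
    for line in log_text.splitlines():
        enqueue = ENQUEUE_RE.search(line)
        if enqueue:
            code = int(enqueue["code"])
            last_error[enqueue["module"]] = OPENCL_ERRORS.get(code, f"unknown ({code})")
            continue
        fallback = FALLBACK_RE.search(line)
        if fallback:
            module = fallback["module"]
            fallbacks.append((module, last_error.get(module)))
    return fallbacks


# The lines quoted above.
SAMPLE_LOG = """\
60,572380 [opencl_atrous] couldn't enqueue kernel! -4
60,603627 [default_process_tiling_opencl_ptp] couldn't run process_cl() for module 'atrous' in tiling mode: 0
60,603627 [opencl_pixelpipe] could not run module 'atrous' on gpu. falling back to cpu path
97,638092 [dev_pixelpipe] took 39,362 secs (138,781 CPU) processed `contrast equalizer' on CPU with tiling, blended on CPU [export]
"""

if __name__ == "__main__":
    for module, error in find_gpu_fallbacks(SAMPLE_LOG):
        print(f"{module}: fell back to CPU ({error})")
```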

Weird. Using darktable 3.3.0+1314~gcdaaee146 (git master), everything is much faster on my 12-year-old machine, Core2 Duo, 4 GB of RAM, NVidia 1060 with 6 GB of RAM, using your NEF and XMP:

35.891082 [dev] took 0.000 secs (0.000 CPU) to load the image.
36.154169 [export] creating pixelpipe took 0.236 secs (0.318 CPU)
36.154255 [pixelpipe_process] [export] using device 0
36.155219 [dev_pixelpipe] took 0.000 secs (0.000 CPU) initing base buffer [export]
36.184672 [dev_pixelpipe] took 0.029 secs (0.048 CPU) processed `raw black/white point' on GPU, blended on GPU [export]
36.188469 [dev_pixelpipe] took 0.004 secs (0.004 CPU) processed `white balance' on GPU, blended on GPU [export]
36.194689 [dev_pixelpipe] took 0.006 secs (0.010 CPU) processed `highlight reconstruction' on GPU, blended on GPU [export]
41.377531 [dev_pixelpipe] took 5.183 secs (8.119 CPU) processed `demosaic' on CPU with tiling, blended on CPU [export]
41.514513 [dev_pixelpipe] took 0.137 secs (0.145 CPU) processed `lens correction' on GPU, blended on GPU [export]
41.517665 [dev_pixelpipe] took 0.003 secs (0.001 CPU) processed `exposure' on GPU, blended on GPU [export]
41.817558 [dev_pixelpipe] took 0.300 secs (0.421 CPU) processed `tone equalizer' on CPU, blended on CPU [export]
41.837743 [dev_pixelpipe] took 0.020 secs (0.018 CPU) processed `input color profile' on GPU, blended on GPU [export]
41.882531 [dev_pixelpipe] took 0.045 secs (0.022 CPU) processed `denoise (non-local means)' on GPU, blended on GPU [export]
41.908088 [dev_pixelpipe] took 0.026 secs (0.000 CPU) processed `contrast equalizer' on GPU, blended on GPU [export]
41.936057 [dev_pixelpipe] took 0.028 secs (0.012 CPU) processed `local contrast' on GPU, blended on GPU [export]
41.942914 [dev_pixelpipe] took 0.007 secs (0.003 CPU) processed `output color profile' on GPU, blended on GPU [export]
42.020912 [dev_pixelpipe] took 0.078 secs (0.086 CPU) processed `dithering' on CPU, blended on CPU [export]
42.059684 [dev_pixelpipe] took 0.039 secs (0.061 CPU) processed `display encoding' on CPU, blended on CPU [export]
42.059903 [opencl_profiling] profiling device 0 ('GeForce GTX 1060 6GB'):
42.059915 [opencl_profiling] spent  0.0519 seconds in [Write Image (from host to device)]
42.059989 [opencl_profiling] spent  0.0016 seconds in rawprepare_1f
42.060059 [opencl_profiling] spent  0.0015 seconds in whitebalance_1f
42.060126 [opencl_profiling] spent  0.0027 seconds in highlights_1f_lch_bayer
42.060204 [opencl_profiling] spent  0.1669 seconds in [Read Image (from device to host)]
42.060272 [opencl_profiling] spent  0.0330 seconds in [Write Buffer (from host to device)]
42.060339 [opencl_profiling] spent  0.0006 seconds in lens_vignette
42.060407 [opencl_profiling] spent  0.0028 seconds in lens_distort_lanczos3
42.060473 [opencl_profiling] spent  0.0004 seconds in exposure
42.060539 [opencl_profiling] spent  0.0005 seconds in colorin_unbound
42.060618 [opencl_profiling] spent  0.0002 seconds in nlmeans_init
42.060683 [opencl_profiling] spent  0.0039 seconds in nlmeans_dist
42.060748 [opencl_profiling] spent  0.0026 seconds in nlmeans_horiz
42.060812 [opencl_profiling] spent  0.0057 seconds in nlmeans_vert
42.060877 [opencl_profiling] spent  0.0089 seconds in nlmeans_accu
42.060942 [opencl_profiling] spent  0.0006 seconds in nlmeans_finish
42.061014 [opencl_profiling] spent  0.0004 seconds in [Copy Image (on device)]
42.061079 [opencl_profiling] spent  0.0109 seconds in eaw_decompose
42.061144 [opencl_profiling] spent  0.0038 seconds in eaw_synthesize
42.061210 [opencl_profiling] spent  0.0004 seconds in pad_input
42.061276 [opencl_profiling] spent  0.0039 seconds in gauss_reduce
42.061339 [opencl_profiling] spent  0.0029 seconds in process_curve
42.061403 [opencl_profiling] spent  0.0043 seconds in laplacian_assemble
42.061467 [opencl_profiling] spent  0.0004 seconds in write_back
42.061531 [opencl_profiling] spent  0.0010 seconds in colorout
42.061594 [opencl_profiling] spent  0.3119 seconds totally in command queue (with 0 events missing)
42.061730 [dev_process_export] pixel pipeline processing took 5.907 secs (8.953 CPU)

The main differences:
You: 39,631271 [dev_pixelpipe] took 21,260 secs (74,266 CPU) processed `tone equalizer' on CPU, blended on CPU [export]

Me: 41.817558 [dev_pixelpipe] took 0.300 secs (0.421 CPU) processed `tone equalizer' on CPU, blended on CPU [export]

You: 58,275698 [dev_pixelpipe] took 18,082 secs (2,172 CPU) processed `denoise (non-local means)' on GPU, blended on GPU [export]

Me: 41.882531 [dev_pixelpipe] took 0.045 secs (0.022 CPU) processed `denoise (non-local means)' on GPU, blended on GPU [export]

You: 97,638092 [dev_pixelpipe] took 39,362 secs (138,781 CPU) processed `contrast equalizer' on CPU with tiling, blended on CPU [export]

Me: 41.908088 [dev_pixelpipe] took 0.026 secs (0.000 CPU) processed `contrast equalizer' on GPU, blended on GPU [export]
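To put numbers on those three pairs, here is the arithmetic as a tiny Python sketch. The only twist is that one log uses decimal commas and the other decimal points; the `seconds` and `slowdown` helpers are just names for this example.

```python
def seconds(text):
    """Accept '21,260' (decimal comma) as well as '0.300' (decimal point)."""
    return float(text.replace(",", "."))


def slowdown(slow, fast):
    """How many times longer the slow run took than the fast run."""
    return seconds(slow) / seconds(fast)


# Wall-clock times copied from the "You" / "Me" lines above.
PAIRS = {
    "tone equalizer": ("21,260", "0.300"),
    "denoise (non-local means)": ("18,082", "0.045"),
    "contrast equalizer": ("39,362", "0.026"),
}

if __name__ == "__main__":
    for module, (you, me) in PAIRS.items():
        print(f"{module}: {slowdown(you, me):.0f}x slower")
```

That works out to roughly 70x, 400x and 1500x. Only the contrast equalizer gap can be explained by the GPU-to-CPU fallback; tone equalizer ran on the CPU on both machines.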

I’m just curious, should I be happy with this? It’s a fairly typical outdoor portrait edit for me.

30020,848272 [dev_pixelpipe] took 0,000 secs (0,000 CPU) initing base buffer [export]
30020,859474 [dev_pixelpipe] took 0,011 secs (0,007 CPU) processed `raw black/white point' on GPU, blended on GPU [export]
30020,861906 [dev_pixelpipe] took 0,002 secs (0,001 CPU) processed `white balance' on GPU, blended on GPU [export]
30020,865500 [dev_pixelpipe] took 0,004 secs (0,002 CPU) processed `highlight reconstruction' on GPU, blended on GPU [export]
30020,882585 [dev_pixelpipe] took 0,017 secs (0,009 CPU) processed `demosaic' on GPU, blended on GPU [export]
30020,893199 [dev_pixelpipe] took 0,011 secs (0,003 CPU) processed `orientation' on GPU, blended on GPU [export]
30020,990140 [dev_pixelpipe] took 0,097 secs (0,249 CPU) processed `retouch' on GPU, blended on GPU [export]
30021,004332 [dev_pixelpipe] took 0,014 secs (0,003 CPU) processed `exposure' on GPU, blended on GPU [export]
30021,022175 [dev_pixelpipe] took 0,018 secs (0,007 CPU) processed `input color profile' on GPU, blended on GPU [export]
30021,951867 [dev_pixelpipe] took 0,930 secs (3,346 CPU) processed `contrast equalizer' on GPU, blended on GPU [export]
30022,460872 [dev_pixelpipe] took 0,509 secs (0,689 CPU) processed `lowpass' on GPU, blended on GPU [export]
30022,472167 [dev_pixelpipe] took 0,011 secs (0,004 CPU) processed `color balance' on GPU, blended on GPU [export]
30022,912286 [dev_pixelpipe] took 0,440 secs (0,739 CPU) processed `color balance Eyes' on GPU, blended on GPU [export]
image colorspace transform Lab-->RGB took 0,026 secs (0,563 CPU) [filmicrgb ]
30023,182908 [dev_pixelpipe] took 0,271 secs (2,725 CPU) processed `filmic rgb' on CPU, blended on CPU [export]
image colorspace transform RGB-->Lab took 0,004 secs (0,001 GPU) [tonecurve ]
30023,243620 [dev_pixelpipe] took 0,061 secs (0,046 CPU) processed `tone curve' on GPU, blended on GPU [export]
30023,771806 [dev_pixelpipe] took 0,528 secs (0,784 CPU) processed `local contrast Eyes' on GPU, blended on GPU [export]
30023,791621 [dev_pixelpipe] took 0,020 secs (0,012 CPU) processed `output color profile' on GPU, blended on GPU [export]
30023,860325 [dev_pixelpipe] took 0,069 secs (0,678 CPU) processed `display encoding' on CPU, blended on CPU [export]
30023,860375 [dev_process_export] pixel pipeline processing took 3,018 secs (9,307 CPU)
[export_job] exported to `/run/media/tore/WORKHORSE/CamDump/20203110 autumn shoot/darktable_exported/DSC_5195.jpg'

Yes, this is a remarkable difference.
Do you have any special performance settings on your system or in DT? And how do I check why contrast equalizer (‘atrous’) was not processed using OpenCL? Though this is apparently not the only problem.

In short: I (and maybe others) will appreciate some help and good advice in this matter because I would love to obtain the kind of performance that you experience.

My pc is: Intel® Core™ i7-4510U CPU @ 2.00GHz 2.60 GHz with 8,00 GB RAM

NVIDIA GeForce GTX 850M

Well, I’m copying here parts of my config that might be relevant (though I suspect something else is the problem), for what it’s worth. I do not have any special settings, except for those recommended for 32-bit systems (I run 64 bit Linux, but with only 4 GB of RAM). I have ‘performance mode’ enabled, but that should not affect exports (and even if it did, it should not produce speed-ups in the 100x-1000x range).

$ grep -iE "(opencl)|(perf)|(mem)|(thread)" darktablerc  
cache_memory=536870912
host_memory_limit=500
opencl=TRUE
opencl_async_pixelpipe=false
opencl_avoid_atomics=false
opencl_checksum=2160716749
opencl_device_priority=*/!0,*/*/*/!0,*
opencl_disable_drivers_blacklist=false
opencl_library=
opencl_mandatory_timeout=200
opencl_memory_headroom=400
opencl_memory_requirement=768
opencl_micro_nap=1000
opencl_number_event_handles=25
opencl_scheduling_profile=very fast GPU
opencl_size_roundup=16
opencl_synch_cache=active module
opencl_use_cpu_devices=false
opencl_use_pinned_memory=false
performance_configuration_version_completed=1
plugins/lighttable/preview/max_in_memory_images=4
ui/performance=TRUE
worker_threads=1

Thank you for a quick response.

Where do I put: $ grep -iE "(opencl)|(perf)|(mem)|(thread)" darktablerc ?

The rest of the settings are in darktablerc, right?

I have noted the following differences only. I have:

host_memory_limit=1500

opencl_checksum=3601598440

opencl_device_priority=!1,s/s/s/s - s means “star”

opencl_scheduling_profile=default

No ui/performance=TRUE statement

worker_threads=8

Can I just copy all your settings? What about opencl checksum?

The ‘grep’ stuff was just a command to show which values I extracted (‘grep’ is a tool to find text in files). Don’t put it in your config file! :slight_smile:

You have the default host_memory_limit, it should be fine. I only have 4 GB, so I use a smaller number.
Don’t touch the checksum.
opencl_device_priority depends on your machine. You probably have an integrated device as well as a card, right? I’d assume that’s why we have different numbers.
opencl_scheduling_profile - compared to your graphics card, you have a much faster CPU than I do (a relatively modern NVidia 1060, but an ancient Core2 CPU). You should leave it as it is.
ui/performance - I think that’s because I’m on the master (unstable, development) version.
worker_threads – again, I had to reduce that because of my old machine (with 8 concurrent operations, I’d run out of RAM). Anyway, that setting only affects thumbnail generation, as far as I know.
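For anyone without grep (on Windows, for example), the same kind of comparison can be sketched in Python. This is only an illustration: the function names are made up, the filter is applied to the setting name only (grep matches the whole line), and the two sample configs contain just a few of the values quoted in this thread.

```python
import re

# Same pattern as the grep command above, applied case-insensitively.
KEY_FILTER = re.compile(r"(opencl)|(perf)|(mem)|(thread)", re.IGNORECASE)


def parse_rc(text):
    """Parse 'key=value' lines, as found in darktablerc, into a dict."""
    settings = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def diff_settings(mine, theirs, key_filter=KEY_FILTER):
    """Return {key: (my_value, their_value)} for matching keys that differ.

    A value of None means the key is missing from that file.
    """
    a, b = parse_rc(mine), parse_rc(theirs)
    differences = {}
    for key in sorted(set(a) | set(b)):
        if key_filter.search(key) and a.get(key) != b.get(key):
            differences[key] = (a.get(key), b.get(key))
    return differences


# A few of the values posted above, not complete config files.
KOFA_RC = """\
host_memory_limit=500
opencl=TRUE
opencl_scheduling_profile=very fast GPU
ui/performance=TRUE
worker_threads=1
"""

OBE_RC = """\
host_memory_limit=1500
opencl=TRUE
opencl_scheduling_profile=default
worker_threads=8
"""

if __name__ == "__main__":
    for key, (mine, theirs) in diff_settings(KOFA_RC, OBE_RC).items():
        print(f"{key}: {mine!r} vs {theirs!r}")
```

With real files, read each darktablerc with `open(path).read()` and pass the two strings in. As said above, don’t copy the checksum or the device priority from someone else’s machine.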

Maybe getting one of the developers here would help? @hanatos, perhaps? (I mentioned you because some people experience module run times 100x to 1000x longer than I do on my 12-year-old machine; see above: Performance issues - #33 by kofa. Not sure whom to contact, really.)

Ok, but is that graphics card actually used? Nothing in the information you provide shows that.

And I know that with some other programs, the NVidia card needs to be explicitly selected, and you need the drivers from NVidia, not the ones that come through standard Windows Update (assuming you use Windows; on Linux you’d also need the drivers provided by NVidia, not the open-source nouveau driver).

The use or non-use of the graphics card does not explain this:
obe (Intel® Core™ i7-4510U CPU @ 2.00GHz 2.60 GHz with 8,00 GB RAM): 39,631271 [dev_pixelpipe] took 21,260 secs (74,266 CPU) processed `tone equalizer' on CPU, blended on CPU [export]

kofa (Intel Core2 Duo @ 2.33 GHz with 4 GB RAM): 41.817558 [dev_pixelpipe] took 0.300 secs (0.421 CPU) processed `tone equalizer' on CPU, blended on CPU [export]