darktable speed (in general, and when using two monitors)

I considered @aadm’s speedup suggestion and wondered if it was at least partially related to nuking darktable’s configs.

As it would take several days for me to reimport all of my photos, I really didn’t want to wipe out my database and start over.

I’m very happy to report that, at least in my case, I found a few major speedups that helped me immensely (listed in what is most likely the order of biggest wins):

  • Back up your current library, then defrag and compact the database with vacuum (mine went from 201 MiB to 177 MiB):

    cp library.db library-`date -I`.db
    sqlite3 library.db 'VACUUM;'
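
    As an extra safety net you can also ask SQLite to verify the database before compacting it; PRAGMA integrity_check is built into SQLite and prints “ok” when the file is healthy:

    sqlite3 library.db 'PRAGMA integrity_check;'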
    
  • Move darktable configs out of the way (while darktable isn’t running) and let darktable create new ones by running it again afterward (the command below renames the config to a file suffixed with the current date, in ISO format):

    mv darktablerc darktablerc-`date -I`
    

    Having darktable recreate your configuration file means that you’ll lose a few settings. One of the most important is which modules are turned on and favorited. You can grep through your old configuration file and copy/paste those lines into your new darktablerc. (First make sure darktable is not running, of course.)

    Here’s a one-liner that will do this for you:

    grep -i -P "plugin.*(favorite|visible)=true" darktablerc-`date -I` >> darktablerc
    

    (If you want to preserve the disabled modules as well, just remove the true at the end of the pattern above.)

    Pretty much everything else is a preference. You can compare the files and copy over the parts you want, or just toggle things in the UI after starting darktable again.
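
    If you want to eyeball the differences before copying anything over, a plain diff of the two files works (nothing darktable-specific here):

    diff -u darktablerc-`date -I` darktablerc | less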

  • Be selective about which Lua plugins you enable.

    • In my case, I had image_path_in_ui turned on, and it seems to be extremely slow — probably even checking the database and file on every hover in lighttable mode. This was what made darktable almost unbearably slow when going through photos.
    • Instead, I found that there’s a relatively recently updated OpenInExplorer plugin which works not just on Windows but also on Linux (with Nautilus). Since I wanted the image’s path in order to open it in Nautilus anyway, this new plugin was a huge performance boost (as it only runs when I want it to) and even made my life a little better too; see the sketch right after this list. :wink:
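
For anyone wanting to try it, here’s a minimal sketch of enabling the script. It assumes the community lua-scripts repository is checked out under your darktable config directory, so adjust the path and the require line to your setup:

    # assumes the lua-scripts repo is cloned to ~/.config/darktable/lua
    echo 'require "contrib/OpenInExplorer"' >> ~/.config/darktable/luarc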

Happy photo editing, everyone!


Good detective work, @garrettn!

I recently thought my darktable was running a bit slow. My editing machine is not connected to the network, and hasn’t been for over a year. I had plugged it in and updated a few weeks ago. Turns out my slowness was coming from the latest speculative execution patches to the kernel. :frowning:

Figured I would add some results to this just because it is easy to do :smiley:

These results are from a clean install of everything, testing out Ubuntu 19.04 and running the darktable 2.6.2 PPA. No files imported yet, so I assume this is a clean database apart from a few settings tweaks.

My results seem quite comparable to what others have found.

Ryzen 7 2700 @ 3.20 GHz (stock clock speed)
Nvidia 980 Ti using the latest 418 driver in Ubuntu
16 GB RAM running in dual channel, clocked at 3000 MHz

OpenCL:      3.307 secs
CPU Only:   13.110 secs

I know I’m late to the party on this thread, but this really caught my eye with DT. As background, I have been a long-time user of RawTherapee and have been using DT a lot of late due to the masking tools etc. that RT lacks.

While using DT generally isn’t too bad, one thing I really noticed was the lack of fluidity or general smoothness when using the curve tools in DT compared to RT. Having used DT quite a bit, I found that for many images I didn’t really need the mask tools, so I moved back to RT and found adjusting the tone curve there a sheer delight.

For what it’s worth, I’ve been using DT both with and without OpenCL and found the same general “roughness” in the experience of adjusting with the curve modules.

Peter.


@plaven
Never too late. I am also keen on performance…

What is that “;” after VACUUM used for?

VACUUM is an SQL command; the semicolon is not part of the command itself but the standard terminator that every SQL statement needs.
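
The semicolons also let you chain several statements in one sqlite3 call. A quick illustration that shows the database’s page count shrinking across the vacuum (both PRAGMAs are standard SQLite):

    sqlite3 library.db 'PRAGMA page_count; VACUUM; PRAGMA page_count;'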

GTX 1060 and Threadripper 1950X. 64 GB RAM. Ubuntu 18.04.

GPU: 5,886015 [dev_process_export] pixel pipeline processing took 5,206 secs (49,141 CPU)
CPU: 9,813531 [dev_process_export] pixel pipeline processing took 9,248 secs (242,660 CPU)

I changed opencl_memory_headroom=300 to 1200 and got:
5,198685 [dev_process_export] pixel pipeline processing took 4,478 secs (22,712 CPU)
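
If anyone wants to try the same change, here is a minimal sketch (edit darktablerc only while darktable is closed; the path assumes a default Linux install):

    # darktable must not be running while darktablerc is edited
    sed -i 's/^opencl_memory_headroom=.*/opencl_memory_headroom=1200/' ~/.config/darktable/darktablerc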


Digging up this old thread. Hope the benchmark is still relevant for DT 3.0.1.

I have a 3rd-gen i5, 16 GB RAM, and a GTX 1050 Ti. I got 14.358 s with OpenCL and 36.189 s using only the CPU.


Is adding a faster GPU likely to make a significant difference? Something like a 2060, maybe? Currently, the slowest part of DT for me is parametric masks.

Yes.

Last time I clocked darktable using a GTX-1050 and a Ryzen 2700X CPU, I got this result:

7.540 seconds with OpenCL
11.020 seconds without OpenCL

Since then I have upgraded the machine to a GTX 1660 Ti and a Ryzen 3900X.
Now I get this result:

2.781 seconds with OpenCL
8.521 seconds without OpenCL

Have fun!
Claes in Lund, Sweden

Since I have noticed a marked improvement in results with the latest darktable 3.0.1, especially with regard to CPU-only processing, I will add these results to the thread.

I’m using the same laptop as in my original post up here (Dell XPS-15, i7-7700HQ @ 2.8 GHz, GeForce GTX 1050, 16 GB RAM, 512 GB SSD):

With GPU:

$ darktable-cli bench.SRW test.jpg --core -d perf -d opencl
[...]
12,562578 [dev_process_export] pixel pipeline processing took 11,683 secs (41,596 CPU)
12,958019 [opencl_summary_statistics] device 'GeForce GTX 1050' (0): 543 out of 544 events were successful and 1 events lost

Only CPU:

$ darktable-cli bench.SRW test.jpg --core -d perf --disable-opencl
[...]
23,373814 [dev_process_export] pixel pipeline processing took 22,616 secs (173,909 CPU)

In summary (old results in brackets):

  • CPU: 22.616 s (80 s)
  • GPU: 11.683 s (13 s)

That’s a sweet upgrade.

Has anyone checked the difference made by only changing the GPU?


Of course!

Using GTX-1050 = 7.540/11.020 seconds (with OpenCL/without OpenCL)
Using GTX-1660 = 3.078/11.091 seconds (with OpenCL/without OpenCL)

I have no GPU, just a 4th-gen i5 with Intel HD 500 graphics, on Linux. I saw a significant speedup simply by changing the kernel release from 5.3.x to 5.4.x. I also tried 5.5.x (better than 5.3) but found 5.4.x the fastest with darktable. So if you use Linux, it seems worth considering which kernel you run. Of course, this is just my experience and could be different on a different PC.
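
If you want to compare kernels yourself, here is a rough sketch using the benchmark command from earlier in this thread (bench.SRW is the test file used there; /tmp/test.jpg is just a throwaway output name):

    uname -r   # note which kernel you booted into
    darktable-cli bench.SRW /tmp/test.jpg --core -d perf 2>&1 | grep dev_process_export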

I have the impression that most performance issues discussed here are focused on the processing times of the pixel pipeline.
Does anyone experience UI lags during processing in the Darkroom?
Especially on complex edits, as soon as a parameter of some module is changed, the whole GUI becomes very unresponsive or freezes until the processing is done. This makes editing images quite frustrating because even simple adjustments like dragging a slider or curve nodes are next to impossible.
Is it only me who is experiencing this? I wonder if I’m doing something wrong.
(Using current master build on Arch; tried with and without Opencl)
I’ve also created an issue in the bug tracker in case someone is interested:

The performance depends on the resolution of your screens: each change to a slider etc. causes a reprocessing of the main view and the preview.
Try starting darktable -d perf to get an idea of which steps are the most time-consuming (denoise and retouch are among them), and then activate those modules at the end of your editing …
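
To make that concrete, you can capture the perf output in a log file and pull out the timing lines afterwards; this is plain shell redirection around the -d perf flag mentioned above (the log path is just a placeholder):

    darktable -d perf 2>&1 | tee /tmp/darktable-perf.log
    grep 'processing took' /tmp/darktable-perf.log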

New graphic card installed today.

KFA2 RTX 2080 Super EX 8 GB and Threadripper 1950X. 64 GB RAM. Ubuntu 18.04.

3,031747 [dev_process_export] pixel pipeline processing took 2,203 secs (20,788 CPU)

My old card’s results are in the link below.

If these values are of any interest, we should perhaps collect them in a table for a better overview. Maybe we can even combine this with GPU benchmarks in darktable.
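
As a starting point for such a table, here is a minimal sketch that pulls the timing line out of saved -d perf logs; the perf-*.log names are placeholders, not files from this thread:

    # hypothetical logs saved from `darktable-cli ... -d perf` runs
    for f in perf-*.log; do
        printf '%s: ' "$f"
        grep -o 'pixel pipeline processing took [0-9.,]* secs' "$f"
    done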

From my tests (n = 3, CPU only) I get these values:

Intel i5-3210M (2 cores/4 threads with 2.5–3.1 GHz, Linux Mint 18.3): 50 s
Intel i5-4590 (4 cores/4 threads with 3.3–3.7 GHz, Windows 7): 37 s
AMD Ryzen 3 2200G (4 cores/4 threads with 3.5–3.7 GHz, Linux Solus 4.1): 28 s

Note: OpenCL via Beignet on the integrated HD 4600 of the i5-4590 slows it down to 47 s.

Despite the numbers indicating a roughly twofold speedup from the mobile i5 to the Ryzen, I found darktable sometimes not very ‘fluid’. Unfortunately, OpenCL isn’t supported on the Ryzen under Solus, but I guess it wouldn’t make a huge difference.
Especially switching between developed RAWs to compare them in fullscreen takes some time until they appear sharp (on a 1680 x 1050 screen). And while improved and new modules are nice, they are often quite heavy on the computational side.

Tried again with 3.6.0. Got 3,721 secs (97,656 CPU) instead of 9,248 secs.
With GPU I got 2,010 secs instead of 2,203 sec.

Is it astrophoto denoise that has been sped up in 3.6.0?

darktable has contributors who have managed to improve the performance of several modules and other parts of darktable. Loads of those improvements landed with 3.6, so speeds are better now and continue to get better.

see how much was done that landed in 3.6: Pull requests · darktable-org/darktable · GitHub

and see how much gets done for 3.8: Pull requests · darktable-org/darktable · GitHub

2 Likes