darktable 3.4/3.5 OpenCL slow on Windows 10

Btw, I also tested this on my Intel i7 laptop, and that one is faster than my brand new desktop.
So, looks like there is nothing directly wrong with darktable for Windows.
However, on Linux on my laptop, I am not sure whether OpenCL brings much benefit: apparently there is not much difference with and without OpenCL.
I think I am a bit confused.

Maybe the preview pipe and the full pixelpipe are both processed on the same device.
Try darktable -d perf -d opencl | grep -e 'dev_process_' -e 'using device' to get an idea of which pipe is processed on which device.
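If the filtered output gets long, a small script can average the timings per pipe. This is only a minimal Python sketch: it assumes the perf lines look like the "pixel pipeline processing took N secs (M CPU)" timings quoted further down in this thread, prefixed by a bracketed dev_process_* name as the grep pattern suggests. The exact log format may differ between darktable versions, and the function name and sample lines are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed line shape (not verified against every darktable version):
#   [dev_process_image] pixel pipeline processing took 0.517 secs (1.316 CPU)
TIMING = re.compile(
    r"\[(dev_process_\w+)\].*?took\s+([\d.,]+)\s+secs\s+\(([\d.,]+)\s+CPU\)"
)


def summarize(lines):
    """Return {pipe_name: (runs, mean_wall_seconds)} for matching lines."""
    totals = defaultdict(lambda: [0, 0.0])
    for line in lines:
        match = TIMING.search(line)
        if not match:
            continue  # skip anything that is not a pipeline timing line
        pipe = match.group(1)
        # Some locales print a decimal comma (e.g. "12,010"), so normalize it.
        wall = float(match.group(2).replace(",", "."))
        totals[pipe][0] += 1
        totals[pipe][1] += wall
    return {pipe: (runs, total / runs) for pipe, (runs, total) in totals.items()}


# Hypothetical sample lines, for illustration only.
sample = [
    "[dev_process_image] pixel pipeline processing took 2.000 secs (3.100 CPU)",
    "[dev_process_preview] pixel pipeline processing took 0.400 secs (0.900 CPU)",
    "[dev_process_image] pixel pipeline processing took 1,500 secs (2,800 CPU)",
]

for pipe, (runs, mean) in sorted(summarize(sample).items()):
    print(f"{pipe}: {runs} run(s), mean {mean:.3f} s")
```

To use it on real output, pass the lines of a saved log to summarize() instead of the sample list.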

To speed things up, it might help to set the device prioritization manually in darktablerc.

On my system, prioritizing the more powerful device for the full pixelpipe and explicitly deprioritizing that device for the preview pipe (opencl_device_priority=1,0,*/!1,*/1,0,*/*/*) increased overall performance. You need to play around with these settings …
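To make that setting easier to read, here is a rough Python sketch that splits the value into one entry per pipe. The field order (image, preview, export, thumbnail, preview2) is an assumption based on my reading of the darktable manual for versions with five fields, so check it against the documentation for your version. The function name and the dictionary keys are hypothetical.

```python
# Assumed field order for a five-field opencl_device_priority value.
PIPES = ("image", "preview", "export", "thumbnail", "preview2")


def parse_priority(value):
    """Split a priority string into per-pipe device preferences."""
    fields = value.split("/")
    if len(fields) != len(PIPES):
        raise ValueError(f"expected {len(PIPES)} fields, got {len(fields)}")
    result = {}
    for pipe, field in zip(PIPES, fields):
        entries = [e.strip() for e in field.split(",") if e.strip()]
        result[pipe] = {
            # Devices listed explicitly, in order of preference.
            "preferred": [e for e in entries if e != "*" and not e.startswith("!")],
            # Devices prefixed with "!" are excluded for this pipe.
            "excluded": [e[1:] for e in entries if e.startswith("!")],
            # "*" stands for any device not named explicitly.
            "any_other": "*" in entries,
        }
    return result


for pipe, rule in parse_priority("1,0,*/!1,*/1,0,*/*/*").items():
    print(pipe, rule)
```

Read this way, the value above prefers device 1 and then device 0 for the image and export pipes, and keeps device 1 away from the preview pipe.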

If that doesn’t help, you can also reduce the preview image to 1/2 or even 1/4 size in preferences to speed this up.

I don’t quite understand. My desktop PC only has one OpenCL device, so there is no point in setting the device priority order, is there?
Nevertheless I will try the command you suggested.
However, if I exclude the device from processing some preview, does that mean that the CPU will be used instead?
I am more and more confused. Why is there a problem on Windows while on Linux everything is fine? The settings are the same, aren’t they? The software version is the same.

You can check the OpenCL-capable devices with darktable-cltest. This also gives you the numbering of your GPU devices.
Usually the CPU’s onboard GPU can also support OpenCL. But you may need to disable blacklisting in darktablerc first; this depends on your CPU.

but not the drivers :wink:

According to darktable-cltest there is only one device, device 0, which is the Nvidia card. What a surprise! My system does not have any iGPU.

I think I will create a second user account and test again. Something is messed up.

I just upgraded the Nvidia driver (Win) on my laptop. I think the driver is broken. It’s slower - as slow as without opencl.

Just a guess: related to disabling crypto-mining in the NVIDIA drivers …

I don’t quite understand what crypto mining is.

I also have an Arch-based Linux on a pendrive. It has the 465 Nvidia driver, and performance seems to be OK. On the internal SSD I have Debian Bullseye. That driver is obviously even older, I think 460 or so.
But the first Nvidia driver that I installed on Windows in May was the 462 driver. And there was a performance issue already.

Your observations sound as if the issue has more to do with the quality of the drivers, i.e. Windows vs. Linux, than with some restrictions recently introduced into the drivers.

I guess the graphics cards that I have are not really suited for crypto mining. The MX250 is ridiculous and the GTX 1660 Super is not so powerful either.
Something is wrong here… something is mysterious.
Someone should test whether opencl works with other apps on Windows.
I have googled a lot because of this problem and I did not find anything helpful or even similar. It’s not possible that no one has upgraded their driver recently. It’s not possible that nobody noticed that the Nvidia driver for Windows is broken, is it?

You could try Geekbench or something similar.
Geekbench 5 - Cross-Platform Benchmark: there’s some kind of free/trial version.

Or PerformanceTest (PerformanceTest FAQ Index): they have a large database of test measurements, so you can compare yours with those of others.

I have a Win10 laptop and 3.4.1 with these specs:

i7-8565U, Nvidia GeForce MX250 2 GB (latest driver), Intel UHD 620, 32 GB RAM, 512 GB SSD, 1920x1200 monitor

I opened up a 20 MP Olympus PEN-F raw file and used denoise (profiled) using your settings. Even with my low-power MX250 the screen update only takes 1-2 seconds.


Good luck. I hope you can find the solution to the problem you are having.

Should you be on the default OpenGL setting in DT, or would the faster graphics settings work better? I still don’t think this is the problem: I have a crappy older NVIDIA card that uses OpenGL with ON1 Photo RAW and it seems to work fine. I think you should maybe do as @kofa says and benchmark it, to see if the numbers seem okay just for the hardware as it is configured. GpuTest - Cross-Platform GPU Stress Test and OpenGL Benchmark for Windows, Linux and OS X | Geeks3D.com

Also, are there any settings that might need changing in the BIOS? You might want to review those to the best of your ability in case there is something weird there, like the cache being disabled or something stupid… just grasping at straws, but you never know.

EDIT… maybe see if there is a BIOS update?

Any chance your motherboard has an onboard GPU? If so, maybe make sure it is disabled. If your system benchmarks are slow without DT involved, maybe review your BIOS settings one by one, and if there is a BIOS update, maybe give that a try. Hope you find the problem… must be very frustrating.

NVIDIA also provides some tests specifically for OpenCL.

If those run fine you can at least exclude that your issue is hardware / bios / driver related.

Guys, I am telling you this is not a problem that just I have. The new Nvidia drivers are broken. But it is difficult to notice it if you don’t have both Linux and Windows.
What I am trying right now: I am downloading the oldest driver that is still available from Nvidia (457/451). Let’s see if that one is broken too.

I also applied some other modules since I had to create some noise to remove. I’d say 2 seconds for denoise only is quite slow.

When you use your MX250 how long does it take?

On Windows, I think I get the same performance as you, 1.something seconds, if only basecurve + denoise non local auto are active.
I think there is no significant difference between with and without opencl.
I am about to download msys2 so I can actually measure the performance on Windows.
On Debian, dt seems to be a bit faster, but there is no real difference between OpenCL and no OpenCL either. Actually, according to darktable -d perf, it is even slightly faster without OpenCL: I measured 0.9 seconds without it and about 1.1 seconds with it.

Edit: On Windows, 2.0 seconds with opencl and 1.5 seconds without opencl (according to darktable -d perf).

I just did some performance tests with my systems, too. I have two Linux systems here running the same darktable version: an old one (i7-4700HQ with GeForce GT 750M) and a newer one (i7-7820HQ with Quadro M1200), and I can observe similar behaviour.
The 4700HQ system is faster with opencl disabled while the 7820HQ system is faster with opencl enabled.

I took an example image and ran the export from darktable with different settings:
4700HQ-GPU-enabled: pixel pipeline processing took 46.309 secs (59.258 CPU)
4700HQ-GPU-disabled: pixel pipeline processing took 29.932 secs (222.965 CPU)
so the 4700HQ system is roughly 1.5 times as fast when not using the GPU

while
7820HQ-GPU-enabled: pixel pipeline processing took 12.010 secs (20.289 CPU)
7820HQ-GPU-disabled: pixel pipeline processing took 21.378 secs (162.917 CPU)
so the 7820HQ system runs at almost double the speed with the GPU enabled
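
For anyone who wants to redo the comparison with their own numbers, the arithmetic is simple enough to script. A minimal Python sketch follows, using the four wall-clock export times quoted above; the function name and the output wording are just illustrative choices.

```python
# Wall-clock export times in seconds, as quoted above.
timings = {
    "4700HQ": {"gpu": 46.309, "cpu": 29.932},
    "7820HQ": {"gpu": 12.010, "cpu": 21.378},
}


def compare(gpu_secs, cpu_secs):
    """Return (faster_mode, speed_ratio, fraction_of_time_saved)."""
    slow = max(gpu_secs, cpu_secs)
    fast = min(gpu_secs, cpu_secs)
    mode = "OpenCL" if gpu_secs < cpu_secs else "CPU only"
    # speed_ratio: how many times faster the quicker mode is.
    # fraction_of_time_saved: share of the slower run time that is avoided.
    return mode, slow / fast, 1 - fast / slow


for name, t in timings.items():
    mode, ratio, saved = compare(t["gpu"], t["cpu"])
    print(f"{name}: {mode} is {ratio:.2f}x as fast ({saved:.0%} less time)")
```

With these figures the ratio comes out at about 1.55 for the 4700HQ (CPU only) and about 1.78 for the 7820HQ (OpenCL).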

Nevertheless, denoise took roughly 2/3 of the processing time on all systems, regardless of whether the GPU was used or not.

It looks like the relationship between CPU and GPU performance is very important here. If you have a fast CPU but a slow to medium-fast GPU, enabling the GPU does not help much with processing; on the contrary, it might even run slower.

@betazoid, your CPU is just so fast that the GPU does not give you an additional boost in performance. The only option I see is to distribute CPU/GPU power between calculating the preview and the full image, as @MStraeten already suggested.
