RT significantly slower on Windows than on Linux

Hi,

I just discovered that RT is significantly slower on Windows than on Linux. I tested this on Windows 10, Debian 11 (mainline kernel) and Garuda (Arch) Linux with the Zen kernel. The desktop environments on Linux are Qtile and Xfce, respectively.

Hardware: Ryzen 7 5800X, 16 GB RAM, Graphics Nvidia Geforce 1660 Super, screen resolution 1920x1200.

I browsed through raws from my Olympus EM5 Mark III (20 megapixels); jumping to the next photo took on average 1.7 sec on Linux and 2.3-2.6 sec on Windows.
I browsed through the same raws on all systems; they are on the internal SSD, and the default processing profile was applied. The RT version is the latest development build from GitHub (dev branch).

There is probably a better way to do this, but I measured it with my smartphone, using some preinstalled Samsung app. Really, though, there is no need to measure it, because the difference is so significant that I can feel it very clearly.

I observed something similar with darktable 1-2 years ago, don’t know if that was fixed meanwhile.

I will probably do more tests with DEs such as KDE Plasma.

What do you think? Has anybody observed something similar?

Hi, Anna!

What Nvidia drivers do you have
on Win, Debian, and Garuda?

Best regards,
Claes in Lund, Sweden

I don’t think that the graphics driver has anything to do with this, since RT does no GPU acceleration.
Btw, on Linux it’s the drivers from the repo (new on Garuda/Arch and older on Debian), on Windows it’s the newest driver, it was updated today.

Anna,
There is at least 13% difference in execution times
between two recent Nvidia drivers — both when clocked
with openCL and without openCL.
Example: between 470.141.03 and 515.65.01.

I am collecting more clues…

/Claes

@betazoid Although this reply won’t be helpful, it might yet be accurate…

Everything I do on my Linux desktop (old hardware) or laptop (even older) is smoother, faster, and less “crashy” than on either Windows or Mac OS. Without the bloat-ware and other corporate-ware, Linux is simply a faster platform to get just about anything done.

Don’t know if there are other issues making your Windows DT installation more sluggish than usual, but faster processes on Linux sounds like a normal comparison.

Do others see no difference across these OS environments and software processing times?

Are you speaking about RT (RawTherapee) or dt (darktable)?

@kofa In my case, I am clocking darktable
with/without openCL, where Nvidia’s proprietary
drivers seem to misbehave.

Ref: “More than 140% of performance loss with drivers superior to 470 series” (Linux, NVIDIA Developer Forums), where you can see that the problem
does not only concern darktable, nor only Linux.

In the link you provided, people are debating gaming performance.

RawTherapee, to the best of my knowledge, does not use the graphics card for accelerating computations.

RawTherapee doesn’t use openCL though :wink:

That has been my experience too. Everything is slower on Windows. (Disclaimer: I guess gaming is supposed to be faster on Windows, but I’m not a gamer.)

I can say with certainty that I am slower on Windows. :wink:

I am a Windows user and only dabble in Linux VMs. My experience with RT is that it’s certainly not terrible on Windows in terms of speed. However, if Anna is describing a significant effect, I am tempted to install a dual boot environment just to test things on exactly the same hardware.

@betazoid which versions of RT do you use, or do you compile it yourself (native) on both systems?
Edit: I read now you use the GitHub compiled builds. It would be interesting to see what difference a native build would provide.

I use the versions that can be downloaded from GitHub, the latest version of the dev branch, not self-compiled: the AppImage and the exe installer, respectively.

I can compile, but I think I have not compiled RT on Windows yet. Maybe tomorrow.

Some possible causes (since NVidia drivers are rather unlikely to influence a CPU-bound application):

  • Maybe the Windows and the Linux build environment / scripts use different compilers, or at least different flags?
  • Windows may have more overhead, as it usually has some anti-virus running (Windows Defender is built in).
  • Self-compiled binaries (optimised for the user’s CPU) may also be faster than generic versions.

@kofa
Latest clockings:

D: darktable version
E: seconds with openCL
F: seconds without openCL
G: distro
H: nvidia driver version

[Image: table of clockings]

Have fun!
Claes in Lund, Sweden

I see no normal reason why it would be slower on Windows vs Linux. Maybe compiler differences or the parameters used, or differences in threading libraries, etc…

But most programs that calculate stuff (like an ffmpeg CPU encode, for instance) are not faster on Linux compared to macOS or Windows. Or the difference is at least within the margin of error.

Now, I do know that Windows is slower in how it handles process starts. Starting a process is more involved on Windows and takes longer. So things like big compiles, which basically spawn a gcc process that finishes quickly and then quickly need to spawn another, are way faster on Linux (or any non-Windows OS, let’s say it like that :wink: ).

Could it be that RawTherapee spawns external processes when opening files, maybe exiftool or something, to get some information? Because that could be an explanation for the longer times.
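A rough way to check the process-start cost on a given machine is to time how long it takes to spawn a trivial child process many times. This is only a sketch: the function name `mean_spawn_seconds` is made up for illustration, the child here is a Python interpreter (so its own startup time is included in the number), and it says nothing about whether RawTherapee actually spawns external processes.

```python
import subprocess
import sys
import time


def mean_spawn_seconds(runs=20):
    """Average wall-clock time to start and finish a trivial child process."""
    # The child does nothing, so we mostly measure process creation plus
    # interpreter startup, not any real work.
    cmd = [sys.executable, "-c", "pass"]
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run(cmd, check=True)
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    print(f"mean spawn time: {mean_spawn_seconds() * 1000:.1f} ms")
```

Running the same script on Windows and on Linux on the same hardware would give a first impression of how much the per-process overhead differs.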

I last used native Linux quite some time ago, so I can’t compare my modern Windows RawTherapee to “what I used to feel back in the day”… But it still feels snappy to me.

@Claes: There are too many factors changing at once.
Look at 4.1.0+391 on Manjaro vs 4.1.0+387 on Kubuntu, and compare the rows using the same NVidia version without OpenCL (so, fix the NVidia version, and for each fixed version, compare the relative performances of Linux + darktable versions):

NVidia 470: 4.392 vs 4.102, or 1.07:1 (caused by changing distro + dt version)
NVidia 515: 4.309 vs 4.095, or 1.05:1 (caused by changing distro + dt version)

Now, if you fix the distro and darktable version, and vary the NVidia driver version, you get:
Kubuntu + 4.1.0+387: 4.102 vs 4.075 (slowest/fastest runs): 1.007:1 (caused by NVidia version change)
Manjaro + 4.1.0+391: 4.392 vs 4.309: 1.019:1 (caused by NVidia version change)

To me, it looks like the Linux + dt version makes much more of a difference (5-7%) than the NVidia version (0.7 - 1.9%). Actually, a measurement difference in the range of 1% could easily be ‘noise’ (caused by other software running during the measurement, like cron jobs and the like).
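The ratios above can be recomputed with a few lines of Python. The timings are the ones quoted in this post (seconds without OpenCL); the helper `ratio` and the dictionary names are only for illustration.

```python
def ratio(slow, fast):
    """Return how many times longer the slow run took than the fast run."""
    return slow / fast


# NVidia driver fixed; distro + darktable version changes.
by_setup = {
    "NVidia 470": ratio(4.392, 4.102),
    "NVidia 515": ratio(4.309, 4.095),
}

# Distro + darktable version fixed; NVidia driver changes.
by_driver = {
    "Kubuntu + 4.1.0+387": ratio(4.102, 4.075),
    "Manjaro + 4.1.0+391": ratio(4.392, 4.309),
}

for label, r in {**by_setup, **by_driver}.items():
    print(f"{label}: {r:.3f}:1 ({(r - 1) * 100:.1f}% slower)")
```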

Plus, if you really want to benchmark, export a given set of pictures to produce longer runs and reduce measurement noise. Be sure to clear the darktable mipmap cache before each batch export, and use the same configuration, of course. And vary one parameter at a time.
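As a minimal sketch of such a benchmark, the script below runs a command several times and reports the median, which is less sensitive to a single noisy run than one measurement. The command shown is only a placeholder; you would substitute your own batch export command. Clearing the cache between runs is not handled here and would have to be done separately. The name `time_command` is made up for this example.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=5):
    """Run cmd several times; return (median, min, max) wall-clock seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    return statistics.median(samples), min(samples), max(samples)


if __name__ == "__main__":
    # Placeholder workload (an assumption for this sketch): replace this
    # list with the real export command and its arguments.
    cmd = [sys.executable, "-c", "sum(range(10**6))"]
    med, lo, hi = time_command(cmd)
    print(f"median {med:.3f} s (min {lo:.3f} s, max {hi:.3f} s)")
```

If the spread between min and max is larger than the difference you are trying to measure, the difference is probably noise.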

This issue may explain some of the performance difference. There is a problem with recent versions of the GCC compiler that we work around by disabling an optimization. That results in a performance drop in some places (capture sharpening, for example). The Windows build uses an up-to-date version of GCC (currently 12.2.0) to compile RawTherapee and is affected. The AppImage build process runs on Ubuntu 18.04 which has the much older GCC 7.5.0 and is not affected.

Apart from the issue Lawrence linked to, this might also be a reason why the file browser can be slower. Processing and adjusting sliders shouldn’t be affected by this.