RT significantly slower on Windows than on Linux

In the link you provided, people are debating gaming performance.

RawTherapee, to the best of my knowledge, does not use the graphics card for accelerating computations.

RawTherapee doesn’t use OpenCL though :wink:

1 Like

That has been my experience too. Everything is slower on Windows. (Disclaimer: I guess gaming is supposed to be faster on Windows, but I’m not a gamer.)

1 Like

I can say with certainty that I am slower on Windows. :wink:

I am a Windows user and only dabble in Linux VMs. My experience with RT is that it’s certainly not terrible on Windows in terms of speed. However, if Anna is describing a significant effect, I am tempted to install a dual boot environment just to test things on exactly the same hardware.

@betazoid which versions of RT do you use, or do you compile it yourself (native) on both systems?
Edit: I see now that you use the GitHub-compiled builds. It would be interesting to see what difference a native build would make.

I use the builds that can be downloaded from GitHub, the latest version of the dev branch, not self-compiled: the AppImage on Linux and the .exe installer on Windows, respectively.

I can compile, but I don't think I have compiled RT on Windows yet. Maybe tomorrow.

Some possible causes (since NVidia drivers are rather unlikely to influence a CPU-bound application):

  • Maybe the Windows and the Linux build environment / scripts use different compilers, or at least different flags?
  • Windows may have more overhead, as it usually has some anti-virus running (Windows Defender is built in).
  • Self-compiled binaries (optimised for the user’s CPU) may also be faster than generic versions (see the sketch below).
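
A rough way to compare the two build environments (just a sketch; it assumes GCC is the compiler on both sides, as it is for the official Windows and AppImage builds):

```bash
# Compare compiler versions first (Windows build machine vs AppImage build environment).
gcc --version

# Show which instruction set a native (self-compiled) build would actually target;
# the generic downloads have to stick to a baseline x86-64 target instead.
gcc -march=native -Q --help=target | grep march
```

That only tells you about the toolchain itself; the actual optimisation flags come from RawTherapee's build scripts, which would be a separate thing to diff.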

@kofa
Latest clockings:

D: darktable version
E: seconds with OpenCL
F: seconds without OpenCL
G: distro
H: NVidia driver version

[clockings — linked table of results]

Have fun!
Claes in Lund, Sweden

I see no normal reason why it would be slower on Windows vs Linux. Maybe compiler differences or parameters used, or differences in threading libraries, etc…

But most programs that calculate stuff (like an ffmpeg CPU encode, for instance) are not faster on Linux compared to macOS or Windows. Or it's at least within the margin of error.
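
If anyone wants to sanity-check that on identical hardware, a CPU-only encode is easy to time reproducibly (a sketch only; it assumes ffmpeg with libx264 is installed, and test.mp4 is just a placeholder for any local file):

```bash
# Pure CPU encode, output discarded; -benchmark makes ffmpeg print utime/rtime at the end.
ffmpeg -benchmark -i test.mp4 -c:v libx264 -preset medium -f null - 2>&1 | tail -n 3
```

Run the same command on Windows and on Linux on the same box and compare the reported times.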

Now, I do know that Windows is slower in how it handles process starts. Starting a process is more involved on Windows and takes longer. So things like big compiles - which basically spawn a gcc process that finishes quickly and then immediately need to spawn another - are way faster on Linux (or non-Windows OSes, let's say it like that :wink: ).

Could it be that RawTherapee spawns external processes when opening files, maybe exiftool or something to get some information? Because that could be an explanation for the longer times.
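
On Linux that is easy to check (a sketch; it assumes the packaged rawtherapee binary is on the PATH):

```bash
# Log every program RawTherapee launches while you browse a few raws;
# an external exiftool/dcraw call would show up as an execve from a child process.
strace -f -e trace=execve -o rt_exec.log rawtherapee
grep execve rt_exec.log | grep -v rawtherapee
```

If nothing but the main binary turns up, process-start overhead is probably not the explanation.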

Native Linux is quite some time ago for me, so I can't compare my modern Windows RawTherapee to 'what I used to feel back in the day'… But it still feels snappy to me.

@Claes: There are too many factors changing at once.
Look at 4.1.0+391 on Manjaro vs 4.1.0+387 on Kubuntu, and compare the rows using the same NVidia version without OpenCL (so, fix the NVidia version, and for each fixed version, compare the relative performances of Linux + darktable versions):

NVidia 470: 4.392 vs 4.102, or 1.07:1 (caused by changing distro + dt version)
NVidia 515: 4.309 vs 4.095, or 1.05:1 (caused by changing distro + dt version)

Now, if you fix the distro and darktable version, and vary the NVidia driver version, you get:
Kubuntu + 4.1.0+387: 4.102 vs 4.075 (slowest/fastest runs): 1.007:1 (caused by NVidia version change)
Manjaro + 4.1.0+391: 4.392 vs 4.309: 1.019:1 (caused by NVidia version change)

To me, that looks like the Linux + dt version makes much more of a difference (5-7%) than the NVidia version (0.7-1.9%). Actually, a measurement difference in the range of 1% could easily be ‘noise’ (caused by other software running during the measurement, like cron jobs and the like).

Plus, if you really want to benchmark, export a given set of pictures to produce longer runs and reduce measurement noise. Be sure to clear the darktable mipmap cache before each batch export, and use the same configuration, of course. And vary one parameter at a time.
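
Something along these lines would do (only a sketch; it assumes darktable-cli is on the PATH, the test raws sit in ./testset, and the cache is in the default ~/.cache/darktable location - adjust paths and options to your setup):

```bash
# Start each run from a cold thumbnail/mipmap cache so the runs are comparable.
rm -rf ~/.cache/darktable/mipmaps*

# Time a CPU-only export of the whole set (drop the --conf part to test with OpenCL).
time darktable-cli ./testset ./out --core --conf opencl=FALSE
```

Repeat each configuration a few times and, as said, change only one thing (driver, distro, dt version) between series.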

This issue may explain some of the performance difference. There is a problem with recent versions of the GCC compiler that we work around by disabling an optimization. That results in a performance drop in some places (capture sharpening, for example). The Windows build uses an up-to-date version of GCC (currently 12.2.0) to compile RawTherapee and is affected. The AppImage build process runs on Ubuntu 18.04 which has the much older GCC 7.5.0 and is not affected.
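
The optimisation in question is floating-point contraction: fusing a * b + c into a single FMA instruction. A toy example shows the effect (a sketch only; it uses a plain GCC invocation with an FMA-capable -march, not RawTherapee's real build flags):

```bash
cat > fma_demo.c <<'EOF'
double mac(double a, double b, double c) { return a * b + c; }
EOF

# Default contraction: GCC emits a single vfmadd instruction.
gcc -O2 -march=haswell -S -o - fma_demo.c | grep -i fmadd

# With the workaround flag: separate multiply and add, which costs speed in hot loops
# such as capture sharpening.
gcc -O2 -march=haswell -ffp-contract=off -S -o - fma_demo.c | grep -i fmadd || echo "no FMA emitted"
```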

2 Likes

Apart from the issue Lawrence linked to, this also might be a reason why the file browser can be slower. Processing and adjusting sliders shouldn’t be affected by this.

I will compile on Arch in a minute. Arch should use the newest compiler.

Just built RT on Garuda. Slower than the AppImage but not nearly as slow as Windows. Jumping to the next raw takes approximately 1.85 sec.
It’s clear that capture sharpening is what slows down the process (the capture sharpening progress bar in the bottom-left corner moves noticeably more slowly).

Going to compile on Windows now.

RT compiled on Windows is as fast as the one compiled on Arch/Garuda.

1 Like

Good job with the investigation, Anna!

1 Like

I see the devs have already done something:
Set -ffp-contract=off compiler flag for GCC >= 11 (#6384) by Lawrence37 · Pull Request #6583 · Beep6581/RawTherapee · GitHub - I guess that’s what I have just tested? So if I had tested it yesterday, it would have been slower…

edit: apparently it’s not merged yet

@Lawrence37 's fork seems to be a little faster than dev on Windows, maybe even faster than the AppImage: jumping to the next raw takes 1.5-1.6 secs, sometimes maybe 1.4.
Going to compile on Arch…

By the way, there is a mistake in the compiling instructions for Windows: it should be git clone https:// not git clone git://
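
For reference, the form that works (repository path taken from the pull request linked above):

```bash
git clone https://github.com/Beep6581/RawTherapee.git
```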

Just compiled @Lawrence37 's version, seems to be as fast as the AppImage.