RT5 and speed query

Just a thought: was 4.2 built using GTK2 (5.0 being GTK3…)?

That’s an optimization level.

Yes, but that does not make a difference to processing speed. With GTK3, RT's startup may be a bit slower, but the processing of images is not.

So it sounds like the next step is to ask Dariusz if he can rebuild with -O3 set, would you say?

At least we should ask him about the build flags he uses to build rt…

Ok, I’ll post something. A big thank you, Ingo, for the help tonight. Now I must go to bed!

Hi,
I’m not a RawTherapee developer, but I agree that you should investigate why there is no -O3 in your build flags. I just did a quick test with your pp3 on my machine, here are some results:


Build type: Release
Build flags:  -std=gnu++11 -march=native -Werror=unused-label -fopenmp -Werror=unknown-pragmas

Execution time with the above:

real 54.21
user 205.26
sys 0.80

Build type: Release
Build flags:  -std=gnu++11 -march=native -Werror=unused-label -fopenmp -Werror=unknown-pragmas -O3 -DNDEBUG

Execution time with the above:

real 11.82
user 42.21
sys 1.22

FWIW, I also use -ffast-math to get a bit more performance (especially on older hardware):

Build type: Release
Build flags: -Wno-deprecated -std=gnu++11 -march=native -Werror=unused-label -fopenmp -Werror=unknown-pragmas -O3 -DNDEBUG -ffast-math

Execution time with the above:

real 10.85
user 38.20
sys 1.27

Not a big deal on my newest (work) laptop, but still nice. On my home machine the difference is more noticeable.

Thanks @agriggio, this is great news. I don’t know exactly what “real” and “user” are, but it’s clearly about 5x faster on your system with -O3 set! Last night, when I was baffled by the -O3, I thought the quoted blocks in posts were just images; I didn’t realise the scroll bar worked! So I can now see the -O3 amongst the flags… (doh)

@Dariusz_Duma, a request for help, please! Is there any chance you could rebuild the Ubuntu download of RT5 with -O3 included? This would surely deliver a huge performance increase for many, perhaps all, users of that build. Perhaps there are other settings that would benefit from a review too. Hoping you can help.
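On the “real” and “user” question: in the output of `time`, `real` is elapsed wall-clock time, `user` is CPU time spent in the program summed over all threads, and `sys` is CPU time spent in the kernel on its behalf. Because the build uses -fopenmp, `user` can be several times larger than `real`. The sketch below only redoes the arithmetic on the figures already posted in this thread; the helper names `speedup` and `cores_busy` are made up for illustration.

```python
# Timings in seconds, copied from the three runs posted earlier in the thread.
runs = {
    "no -O3": {"real": 54.21, "user": 205.26, "sys": 0.80},
    "-O3 -DNDEBUG": {"real": 11.82, "user": 42.21, "sys": 1.22},
    "-O3 -DNDEBUG -ffast-math": {"real": 10.85, "user": 38.20, "sys": 1.27},
}


def speedup(slow, fast):
    """Ratio of wall-clock times: how many times faster `fast` finished."""
    return slow["real"] / fast["real"]


def cores_busy(run):
    """Average number of CPU cores kept busy: total CPU time / elapsed time."""
    return (run["user"] + run["sys"]) / run["real"]


baseline = runs["no -O3"]
for name, run in runs.items():
    print(f"{name:26s} {speedup(baseline, run):5.2f}x vs. baseline, "
          f"~{cores_busy(run):.1f} cores busy")
```

On these numbers the -O3 build is roughly 4.6x faster in wall-clock terms, and adding -ffast-math brings it to about 5x, with a little under four cores busy in each run.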

Using -ffast-math is dangerous. Ingo also mentioned that once, if I recall correctly, but I can’t find the link I have in mind.

Best,
Flössie
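As a general illustration of why -ffast-math is considered risky (this is not RawTherapee code, and it does not demonstrate any specific RawTherapee problem): the flag allows the compiler to regroup floating-point operations and to assume that NaN and infinity never occur. Floating-point addition is not associative, so regrouping can change a result. A minimal Python sketch of that underlying IEEE 754 behaviour, with values chosen purely for illustration:

```python
import math

big = 1e16  # doubles near 1e16 are spaced 2 apart, so adding 1.0 can be lost

# Same three numbers, two groupings. A compiler that is allowed to
# reassociate may pick either one.
left_to_right = (big + -big) + 1.0   # the large values cancel first
regrouped = big + (-big + 1.0)       # the 1.0 is rounded away first

print(left_to_right, regrouped)      # 1.0 0.0

# With finite-math assumptions, a compiler may also drop NaN checks like
# these because it assumes NaN cannot occur.
nan = float("nan")
print(nan != nan, math.isnan(nan))   # True True
```

Whether any of this matters in practice depends on the code being compiled, which is why asking for a concrete example is reasonable.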

@floessie we should review this. Seems it does not work…

I’m aware of the dangers in general, but do you have any specific example for RawTherapee? I would especially be interested if the flag results in some sub-optimal output (e.g. causes some artifacts).

Also, I should clarify that I’m not recommending building with -ffast-math by default; I’m only suggesting it as something worth trying to get some extra speed. Personally, I haven’t experienced any issues with RT since I moved to 64-bit, and I’m using -ffast-math on both my machines. But maybe I’m just lucky :)

I noticed that, and that’s why I set -ffast-math only in RTENGINE_CXX_FLAGS.
I did it out of curiosity at the beginning, but since I didn’t notice anything strange, I left it on…

@agriggio you should not use Colour toning then

noted, thanks for the pointer!

@agriggio If you want more speed for free, try building with LTO enabled. Last time I tried it (a year ago or so) it made a difference.

Thanks for the tip. I just tried, but I get a bunch of undefined references when linking :( (I simply set WITH_LTO=ON, is there anything else needed?)

Yes: In addition to -DWITH_LTO=ON add -DCMAKE_AR=/usr/bin/gcc-ar and -DCMAKE_RANLIB=/usr/bin/gcc-ranlib to the cmake line.

HTH,
Flössie
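Put together, the configure step might look like the sketch below. Only the three LTO-related options come from the post above; the `Release` build type matches the builds quoted earlier in the thread, while the source directory `..` and the helper name `lto_cmake_command` are assumptions for illustration. `gcc-ar` and `gcc-ranlib` are GCC's wrappers around `ar` and `ranlib` that load the LTO plugin, which is likely why the plain tools led to undefined references.

```python
import shlex


def lto_cmake_command(source_dir="..", build_type="Release"):
    """Assemble a cmake configure command with LTO enabled.

    source_dir and build_type are placeholders; adjust them to your checkout.
    """
    return [
        "cmake",
        f"-DCMAKE_BUILD_TYPE={build_type}",
        "-DWITH_LTO=ON",
        "-DCMAKE_AR=/usr/bin/gcc-ar",          # path as given in the post
        "-DCMAKE_RANLIB=/usr/bin/gcc-ranlib",  # path as given in the post
        source_dir,
    ]


command = lto_cmake_command()
print(shlex.join(command))
```

The printed line can be pasted into a shell from the build directory. The `/usr/bin` paths may differ on other distributions.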

Thanks, works now!

Don’t know about Ubuntu, but on Debian 8 or the Debian 9 RC, any version built on GTK3 is slow. The GTK2 builds, which allow using the system’s own theme in settings, work nice and snappy; there is no real speed difference between my 2GB RAM laptop and my 16GB RAM tower (obviously exports will be faster on the tower, though).

I assume it must be because my Debian hasn’t properly adopted GTK3 yet?

I’d be interested in what goes faster, and how much faster it goes please.

It goes about half a second faster (with your pp3). Nice, considering that it is “for free”, but it won’t change your day…