Just a thought, was 4.2 built using GTK2 (5.0 being GTK3…) ?
That’s an optimization level.
Yes, but that does not make a difference for processing speed. With GTK3, RT’s startup may be a bit slower, but image processing is not.
So it sounds like the next step is to ask Dariusz if he can re-build with “-O3” set, would you say?
At least we should ask him about the build flags he uses to build rt…
Ok, I’ll post something. A big thank you, Ingo, for the help tonight. Now I must go to bed!
Hi,
I’m not a RawTherapee developer, but I agree that you should investigate why there is no -O3 in your build flags. I just did a quick test with your pp3 on my machine, here are some results:
Build type: Release
Build flags: -std=gnu++11 -march=native -Werror=unused-label -fopenmp -Werror=unknown-pragmas
Execution time with the above:
real 54.21
user 205.26
sys 0.80
Build type: Release
Build flags: -std=gnu++11 -march=native -Werror=unused-label -fopenmp -Werror=unknown-pragmas -O3 -DNDEBUG
Execution time with the above:
real 11.82
user 42.21
sys 1.22
FWIW, I also use -ffast-math to get a bit more performance (especially on older hardware):
Build type: Release
Build flags: -Wno-deprecated -std=gnu++11 -march=native -Werror=unused-label -fopenmp -Werror=unknown-pragmas -O3 -DNDEBUG -ffast-math
execution time:
real 10.85
user 38.20
sys 1.27
Not a big deal on my newest (work) laptop, but still nice. On my home machine the difference is more noticeable.
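For anyone who wants to check a flags string the same way, here is a minimal sketch. It assumes the flags have been copied by hand into a shell variable; the `opt_level` helper and the variable names are made up for illustration and are not part of RawTherapee or GCC. The relevant fact is that GCC compiles at -O0 (no optimization) when no -O option is given at all.

```shell
#!/bin/sh
# Report the last -O level found in a compiler flag string.
# opt_level is a hypothetical helper, not part of RawTherapee or GCC.
opt_level() {
    level="none (GCC then defaults to -O0)"
    for flag in $1; do
        case "$flag" in
            -O*) level="$flag" ;;
        esac
    done
    echo "$level"
}

# The first two flag sets quoted above.
slow="-std=gnu++11 -march=native -Werror=unused-label -fopenmp -Werror=unknown-pragmas"
fast="$slow -O3 -DNDEBUG"

echo "first build:  $(opt_level "$slow")"
echo "second build: $(opt_level "$fast")"
```

The helper keeps the last -O option it sees because GCC also lets the last one on the command line win.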
Thanks @agriggio, this is great news. I don’t know exactly what “real” and “user” are, but it’s clearly about 5x faster on your system with -O3 set! Last night when I was baffled by -O3, I thought the quoted / graphic bits in posts were just images; I didn’t realise the scroll bar worked! So I can now see the -O3 amongst the flags… (doh)
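On “real” and “user”: in the output of the time command, real is the wall-clock time that elapsed, user is the CPU time spent in the program’s own code added up over all threads, and sys is the CPU time spent in the kernel on its behalf. With OpenMP enabled, RT runs on several cores at once, which is why user is larger than real. A small sketch of the arithmetic, using only the numbers posted above (the variable names are just for illustration):

```shell
#!/bin/sh
# Timings copied from the post above (seconds).
real_noopt=54.21; user_noopt=205.26
real_o3=11.82;    user_o3=42.21

# Wall-clock speedup from adding -O3 -DNDEBUG.
speedup=$(awk -v a="$real_noopt" -v b="$real_o3" 'BEGIN { printf "%.2f", a / b }')

# user / real: roughly how many cores were busy on average.
cores_noopt=$(awk -v u="$user_noopt" -v r="$real_noopt" 'BEGIN { printf "%.2f", u / r }')
cores_o3=$(awk -v u="$user_o3" -v r="$real_o3" 'BEGIN { printf "%.2f", u / r }')

echo "speedup: ${speedup}x"
echo "average busy cores: $cores_noopt (no -O3), $cores_o3 (-O3)"
```

So the wall-clock gain on that machine is a bit under 5x, and in both runs between three and four cores were busy on average.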
@Dariusz_Duma, Request for Help Please !! Is there any chance you might be able to re-build the download for Ubuntu and RT5 with -O3 included? Please, please. This would deliver a huge performance increase for many or all such users, surely? Perhaps there are other settings that might benefit from reviewing. Hoping you can help.
Using -ffast-math is dangerous. Ingo also mentioned that once, if I recall correctly, but I can’t find the link I have in mind.
Best,
Flössie
I’m aware of the dangers in general, but do you have any specific example for RawTherapee? I would especially be interested if the flag results in some sub-optimal output (e.g. causes some artifacts).
Also, I should clarify that I’m not recommending building with -ffast-math by default; I’m only suggesting it as something worth trying to get some extra speed. Personally, I haven’t experienced any issues with RT since I moved to 64-bit, and I’m using -ffast-math on both my machines. But maybe I’m just lucky.
I noticed that, and that’s why I set -ffast-math only in RTENGINE_CXX_FLAGS.
I did it out of curiosity at the beginning, but since I didn’t notice anything strange, I left it on…
Noted, thanks for the pointer!
@agriggio If you want more speed for free, try building with LTO enabled. Last time I tried it (a year ago or so) it made a difference.
Thanks for the tip. I just tried, but I get a bunch of undefined references when linking (I simply set WITH_LTO=ON, is there anything else needed?)
Yes: in addition to -DWITH_LTO=ON, add -DCMAKE_AR=/usr/bin/gcc-ar and -DCMAKE_RANLIB=/usr/bin/gcc-ranlib to the cmake line.
HTH,
Flössie
Thanks, works now!
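Put together, the configure step might look like the sketch below. The three LTO-related options are the ones named above; the source path (RT_SRC) is a placeholder I made up, and the Release build type is taken from the earlier timing posts. As far as I understand it, gcc-ar and gcc-ranlib are GCC’s wrappers around ar and ranlib that load the LTO plugin, which is why using the plain tools can end in undefined references at link time.

```shell
#!/bin/sh
# Hypothetical path to a RawTherapee source checkout; adjust as needed.
RT_SRC="${RT_SRC:-$HOME/rawtherapee}"

# Options named in the posts above.
LTO_OPTS="-DWITH_LTO=ON -DCMAKE_AR=/usr/bin/gcc-ar -DCMAKE_RANLIB=/usr/bin/gcc-ranlib"

# Dry run: this only prints the command. Remove "echo" to actually configure.
echo cmake -DCMAKE_BUILD_TYPE=Release $LTO_OPTS "$RT_SRC"
```

The paths under /usr/bin are the ones given in the post; other distributions may install the wrappers elsewhere.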
Don’t know about Ubuntu, but on Debian 8 or the Debian 9 RC, any version built on GTK3 is slow. The GTK2 builds, which allow using the system’s own theme in settings, work nice and snappy; no real speed difference between my 2GB RAM laptop and my 16GB RAM tower (obviously exports will be faster on the tower, though).
I assume it must be because my Debian hasn’t properly adopted GTK3 yet?
I’d be interested in what goes faster, and how much faster it goes please.
It goes about half a second faster (with your pp3). Nice considering that it is “for free”, but it won’t change your day…