dt 2.7 still facing some performance issues

Hey y’all,

It seems I am the one who is “on” for performance issues :slight_smile:

My journey started here, and we improved that one a lot, thanks to @Edgardo_Hoszowski and the others involved there.

Some time ago I saw Aurelian’s post, which made the point that users may demand performance improvements too easily, without recognizing the incredible amount of work needed to improve something, which in the worst case is unspecific…

What currently bothers me is probably spread over several issues on GitHub. The one that motivates me to write tonight is this one:

https://github.com/darktable-org/darktable/issues/2514#issuecomment-498410996

Additional thoughts

  • I use Gentoo and keep it up to date with cron scripts; not many, but quite a few, development packages are installed.
  • The major compile optimizations are: it is “RELEASE”, I assume (no debug code built in), and in my /etc/make.conf I have the following:
    CFLAGS="-march=nocona -O2 -mtune=generic -mssse3 -mfpmath=sse -pipe"
    CXXFLAGS="${CFLAGS}"
    Not that aggressive
  • I actually don’t know how dt compiles with its build.sh, or how to smartly add tweaks that are not lost after a new git pull.
  • I recently upgraded from an old Core i7 to a Core i9 9900K and at first had a big WOW! on dt 2.6.2, but that somehow got reverted a bit when I started using dt 2.7, which is what brings me to this.
  • As @Claes can sometimes reproduce my issues on his Ryzen, I get the feeling that the performance issue might be more visible on faster machines, as they are supposed to be lag-free, whereas on others (please don’t take this as arrogant) it may just add another 0.2 s to an already existing waiting time. So maybe it is not about cycle time but about fixed absolute times… Sorry, professionals, if that sounds silly; I am trying my best to describe what I experience…
  • Please see what I reported in the aforementioned issue about the findings with nvidia-smi.

I still get a strange multiplication of module entries in the history stack, just because I touch the same module several times. That might be related, or it might not. However, it reminds me of another issue, #2420, which I reported some time ago and which unfortunately is closed, even though I could see from Bruce’s video that it can still occur.

I love dt and hope that with this I can play a small part in making it even better…

Sincerely
Axel

I have this too, for several modules, if not all. But after resetting the history stack the multiplication is gone.
If I understand properly, all the steps in the stack are executed, so duplicated steps would produce some extra delay…
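
One way to check whether a history stack really contains repeated entries is to count them per module in the image's XMP sidecar. This is only a minimal sketch: it assumes each history item in the sidecar carries a darktable:operation="…" attribute, which is not confirmed anywhere in this thread, and the file name in the usage line is made up.

```shell
#!/bin/sh
# Count history entries per module in a darktable XMP sidecar.
# ASSUMPTION: each history item carries a darktable:operation="<module>" attribute.
count_history_ops() {
    # $1 = path to the .xmp sidecar
    grep -o 'darktable:operation="[^"]*"' "$1" \
        | sed 's/.*="\(.*\)"/\1/' \
        | sort | uniq -c | sort -rn
}

# Hypothetical usage:
#   count_history_ops IMG_0001.CR2.xmp
```

A module that shows up with a much higher count than the number of deliberate edits would be a candidate for the duplication described above.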

I fiddled around a bit more…

Currently I am building my dt 2.7 with the script below. (nocona because of “icecream”, a distcc-like compiler cluster, but I will change to native now…)

That improved many things; however, I am still facing the issue that most of the time when I touch a module for the first time, especially WB, it is sluggish at first. When I release the slider and touch it again, the reaction is suddenly much more fluid.

nvidia-smi reflects that too, with a higher GPU utilization.

Can anyone confirm that?

#!/bin/sh

PATH_SCRIPT="/root/darktable"
CFLAGS_SCRIPT="-march=nocona -O2 -mtune=native -mssse3 -mfpmath=sse -pipe"
CXXFLAGS_SCRIPT="${CFLAGS_SCRIPT}"
BUILD="Release"
MAKE="-j15"

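# Stop at the first failing step instead of building a stale tree.
cd "$PATH_SCRIPT" || exit 1
git pull || exit 1
git submodule update || exit 1

cd "$PATH_SCRIPT/build" || exit 1
# Pass the flags as CMake cache variables; as bare, unquoted arguments they are
# split into separate words and not applied as compiler flags.
cmake -DCMAKE_BUILD_TYPE="$BUILD" \
      -DCMAKE_C_FLAGS="$CFLAGS_SCRIPT" \
      -DCMAKE_CXX_FLAGS="$CXXFLAGS_SCRIPT" ..
make $MAKE && make install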

exit 0
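
To put numbers on the first-touch lag described above, GPU utilization can be sampled at a sub-second interval while moving the slider. A rough sketch, assuming nvidia-smi is on the PATH; the function names, log file names and the 200 ms interval are arbitrary choices for illustration.

```shell
#!/bin/sh
# Log GPU utilization while working in darktable, then report the peak.

# Start sampling; stop with Ctrl-C. Not called automatically.
log_gpu() {
    # $1 = output file, $2 = sampling interval in milliseconds
    nvidia-smi --query-gpu=timestamp,utilization.gpu,memory.used \
               --format=csv,noheader,nounits -lms "$2" > "$1"
}

# Print the highest utilization.gpu value (second CSV column) in a log file.
peak_util() {
    awk -F', *' '$2 + 0 > max { max = $2 + 0 } END { print max + 0 }' "$1"
}

# Hypothetical usage:
#   log_gpu first-touch.csv 200    # move the WB slider once, then Ctrl-C
#   log_gpu second-touch.csv 200   # move it again, then Ctrl-C
#   peak_util first-touch.csv; peak_util second-touch.csv
```

Comparing the two peaks would show whether the GPU is really less busy during the first touch. Running darktable with -d perf -d opencl at the same time should give matching per-module timings.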

You are aiming for maximum performance and then optimize for an old Pentium 4 CPU???

:slight_smile:
As I said above, I did that due to compiler clustering. With native the code cannot run on other CPUs.

  • I have now changed to -march=native -O3 and omitted -ggdb (yes, a bit faster, but less than I thought)
  • I have used this only for the last few days; before that I used build.sh
  • either way, I have that lag when I touch WB for the first time, and it is fluid after the second touch…
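
Since -march=native ties the binary to the build host, it can help to see which CPU target the compiler actually selects on a given machine. A small sketch, assuming GCC is the compiler in use; extract_march is a helper name made up for this example.

```shell
#!/bin/sh
# Show which CPU target GCC selects for -march=native on this machine.

# Extract the value of the "-march=" line from `gcc -Q --help=target` output on stdin.
extract_march() {
    awk '$1 == "-march=" { print $2; exit }'
}

# Only query the compiler if gcc is available.
if command -v gcc >/dev/null 2>&1; then
    gcc -march=native -Q --help=target 2>/dev/null | extract_march
fi
```

If the printed target differs between the nodes of a compile cluster, objects built with -march=native on one node are not guaranteed to run on another, which is the clustering constraint mentioned above.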

I put things together in a new issue, #2724, and closed the one above. I wonder whether anybody can reproduce it…