darktable performance regression with kernel 6.7 and newer.

I don’t think that’s true at all, and we have no way of knowing how people use DT. Personally, I was a CPU-only user until the Spectre/Meltdown mitigations cut my CPU to about 25% of its former speed. Before that it was acceptable, and I was processing 45 Mpix images.

There is no GPU-gate to what is and isn’t serious.

Sorry to assume; I know RT doesn’t use the GPU and works not too badly, and other editors may not either, but a GPU is pretty universal in some form for most people editing photos, unless there are financial constraints… it’s generally 5 to 10x faster on even what are now modest GPUs. The fact that one module is impacted by a kernel update, and that the regression might disappear just as fast with the next update, would not have me setting off on a game of whack-a-mole… but everyone has their itch to scratch… I wonder if any of the other wavelet modules stand out. I think this is an old xmp… I wonder, if denoise (profiled) were added, or diffuse or sharpen, or maybe retouch comes to mind, whether anything doing wavelet computations would reveal something… or maybe not…

It is normal for you perhaps, but again we have no way of knowing about the hardware of the userbase at large.

I wish I knew more about coding. Apologies up front. I tried to read through the source code, though, as there are quite a lot of comments, and it seems like there are options in the code to invoke different versions of the module… some references to new and old, some to the new global tiling vs. single-module tiling, and V1 and V2… like this small snippet…

(screenshot of the code snippet)

I was trying to remember/understand how compatibility works: if an older-version xmp includes a module with a long history, does that control which of those code options is used as part of that process?

I guess I was just wondering if this xmp is old enough that it triggers the use of the older code, which maybe runs as expected on the “older” kernel but not so well on a newer one.

It would, I guess, be easy to test this; I would just have to set up a Linux box…

And all that might just be nonsense from someone curious but not clear on how the code works…

But it doesn’t seem to be a regression in darktable; it’s a regression in the Linux kernel, as OP is changing kernel versions, not darktable versions or xmp versions, and is seeing a difference in speed.

It’s version 4 (at least the one I downloaded from @kofa’s link), and the current up-to-date ones report version 5, but that’s not likely different enough to run an older set of code in the CE module. I thought the xmp might be much older, like version 2 or something, and my thinking was that each kernel sees the same code, but maybe an older version dictated by the older xmp was working slightly better on the older kernel… but it would seem that is most certainly not the case…

Could it be a problem only with Ryzen, then?
I read the changelog of 6.7 but could not spot anything obvious (which doesn’t mean much). The memory-management changes seem huge, though. But I have no idea in which direction to search…
If someone has an idea, I can test it with my above mentioned versions!

I just tested on my laptop. I can confirm that I do not see the issue on my laptop either. It is a Lenovo X1 carbon with Intel Core i5-8250U.

I tested kernel 6.6.54 and 6.10.3. In both cases atrous takes ca. 3.4 s and pixel pipeline takes ca. 18 s.

On the other hand I can reproduce the performance degradation on two Ryzen PCs.

Looks like the issue is AMD related.


I got instructions from the kernel devs on how to do a git bisect.

https://docs.kernel.org/admin-guide/bug-bisect.html

I just started that process this morning. This will take a while. git estimates that I need to build and test approx. 13 kernels to find the bad commit. Let’s see how that goes.
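For anyone unfamiliar with the process, here is a minimal sketch of the `git bisect run` workflow on a synthetic throwaway repository. The repo, the `perf.txt` file, and the “slow” marker are all invented for illustration; the real bisect runs against the Linux kernel tree, with a darktable benchmark standing in for the test command.

```shell
#!/bin/sh
# Sketch of the `git bisect run` workflow on a synthetic repo.
# Everything here (perf.txt, the "slow" marker) is made up for
# demonstration; the real bisect ran against the kernel tree.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo

# Create 20 commits; commit 13 introduces a persistent "regression".
for i in $(seq 1 20); do
  if [ "$i" -ge 13 ]; then
    echo "slow $i" > perf.txt
  else
    echo "fast $i" > perf.txt
  fi
  git add perf.txt
  git commit -qm "commit $i"
done

# Mark the endpoints (HEAD is bad, the root commit is good), then let
# bisect drive the binary search with a test command:
# exit 0 means good, non-zero means bad.
git bisect start HEAD "$(git rev-list --max-parents=0 HEAD)" >/dev/null
git bisect run sh -c '! grep -q slow perf.txt' >/dev/null 2>&1

# refs/bisect/bad now points at the first bad commit.
bad_subject=$(git log -1 --format=%s refs/bisect/bad)
echo "first bad commit: $bad_subject"
git bisect reset >/dev/null
```

With 20 commits, bisect needs about 4–5 test runs; that matches the “approx. 13 kernels” estimate for the much larger kernel history between two releases.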


I completed the git bisect and found the commit causing the issue. When I revert that commit on kernel 6.10.14, for example, the pixel pipeline time goes down from 4.7 s to 3.8 s. That is significant.

You can find all the details in the kernel issue tracker:
https://bugzilla.kernel.org/show_bug.cgi?id=219366


Thank you for the effort you’ve put into this investigation!


Your report even made it into the patch notes :smiley:


The Intel kernel test robot reported the 3888.9% improvement with its “will-it-scale.per_process_ops” scalability test case running on an Intel Xeon Platinum (Cooper Lake) test server.

So it’s not an AMD-only issue then?


An article (in German) explains some of the background: Entwickler verneint 4000 Prozent schnelleren Linux-Kernel | heise online (“Developer denies 4000 percent faster Linux kernel”).
There are links to the original posts in the article.

It appears that the Intel benchmark is BS and not representative. The bug seems to only affect AMD. Again, the darktable bug report is mentioned explicitly :slight_smile:
