ART processes images 3-4 times slower than RawTherapee for some reason

Not for me, but my laptop isn’t a racehorse, so that’s a given. ART is still a little more responsive for me than, e.g., darktable on the same computer. As I indicated before, I get NR (and/or sometimes sharpening) optimized, then disable it until export time.

Could it be related to the multithreading implementation? One thing I noticed (a long time ago) is that on Windows the ART process spawns a lot of threads, and respawns many of them whenever any processing happens.

For example, I start ART (latest) and open one photo, and I see 1035 threads in the ART process in Task Manager. If I set 8 threads in settings (instead of 0 on my 32-thread CPU), I see 300 ART threads in Task Manager. When any processing happens (zooming, scrolling, etc.), a lot of threads get destroyed and created again.

I guess that’s specific to the openmp implementation in mingw, but spawning 1000+ threads, or respawning many of them every time any processing happens, still doesn’t sound right: spawning threads on Windows is expensive and slow, and it also puts unnecessary load on the OS scheduler.

In any case, ART is still very fast for me, faster than other similar software (including paid ones). I don’t complain, rather I’m curious if it can be improved and if I can help with it.

For what it’s worth, on my AMD Ryzen 5600 8-core CPU (latest stable ART, 16 GB memory, Windows 11) I see about 250 threads after launch and opening one 24MP image. Zoom, unzoom, add a brush mask, etc., and it settles around 265 threads. I have never seen it go higher, and the thread count stays pretty level no matter what I do (so far at least). Memory usage settles at about 725 MB, with IO at about 6K reads and 1.6K writes.

That’s kinda empirical, but just offering it for comparison.

Yes, it doesn’t change much and settles at some number, but in Process Hacker (an advanced task manager) I see individual threads of the ART process getting destroyed and created a lot, which isn’t good behavior.

I took a quick look at the source code and I see rather advanced openmp usage (not just parallel for). I might try to experiment with a custom thread pool and a lightweight job system if I find time, though it might be too many changes for too little gain (or no gain at all :smiley: as openmp is mature and optimized already, especially the good implementations of it).
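To make the thread pool idea concrete, here is a minimal sketch in Python rather than ART's C++, and not based on ART's code. It only illustrates the difference between reusing a fixed set of workers and spawning a fresh thread for every unit of work. The names `process_tile`, `run_with_pool` and `run_with_fresh_threads` are made up for this example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def process_tile(tile_index):
    """Stand-in for per-tile image work; returns the name of the thread that ran it."""
    return threading.current_thread().name


def run_with_pool(num_tiles, num_workers):
    """Run every tile on a fixed pool and return the set of worker names used."""
    with ThreadPoolExecutor(max_workers=num_workers,
                            thread_name_prefix="tile") as pool:
        return set(pool.map(process_tile, range(num_tiles)))


def run_with_fresh_threads(num_tiles):
    """Spawn one short-lived thread per tile and return how many were created."""
    created = []
    for i in range(num_tiles):
        worker = threading.Thread(target=process_tile, args=(i,))
        worker.start()
        worker.join()
        created.append(worker)
    return len(created)


if __name__ == "__main__":
    print("pool workers used:", len(run_with_pool(100, 4)))
    print("threads spawned without a pool:", run_with_fresh_threads(100))
```

With the pool, the number of distinct workers never exceeds `num_workers`, no matter how many tiles are submitted. Without it, every tile pays the cost of creating and destroying a thread.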

I normally use plain and simple omp pragmas, because I’m far from an openmp expert. Whatever advanced patterns you are seeing are most likely coming from @heckflosse and inherited from RT. Fwiw :slight_smile:

Hi Alberto

I just tried to build with the fast float plugin with no success.
As I am a little dense in those matters, could you provide a how-to? It could be a bash script.
Thanks a lot for the help.

Hello,

cannot contribute directly since I do not use ART.

I use exiftool in a macro I wrote for ImageJ, and I realized that sometimes this macro is extremely (really) slow. If I copy my stuff to another disk, it works fine. Phil Harvey could not find an explanation for this behaviour.

From monitoring it is clear that exiftool is the problem. I am using Windows.

Hermann-Josef

Hi,
Thanks for the comment. I forgot to say that I also reworked the exiftool backend: it now uses a long-running process instead of spawning a new one at each invocation. This should make a visible difference, especially on Windows. However, exiftool is used only as a fallback if exiv2 fails, so if you have a camera for which exiv2 works fine you won’t see any difference.
As for the slow behaviour of exiftool, I think maybe antivirus software and the like could be a culprit?
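For readers curious what a long-running exiftool process looks like in practice, here is a rough Python sketch built on exiftool's documented `-stay_open` mode. It is not ART's implementation. The class name `PersistentExifTool`, the helper `build_request`, and the assumption that `exiftool` is on the PATH are all choices made for this example.

```python
import subprocess


def build_request(args):
    """Format one command for the -stay_open stream: one argument per line, ended by -execute."""
    return "\n".join(list(args) + ["-execute"]) + "\n"


class PersistentExifTool:
    """Keeps a single exiftool process alive and feeds it commands over stdin."""

    def __init__(self, executable="exiftool"):  # executable name is an assumption
        self.proc = subprocess.Popen(
            [executable, "-stay_open", "True", "-@", "-"],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)

    def run(self, *args):
        """Send one command and collect output until exiftool prints {ready}."""
        self.proc.stdin.write(build_request(args))
        self.proc.stdin.flush()
        lines = []
        while True:
            line = self.proc.stdout.readline()
            if not line or line.strip() == "{ready}":
                break
            lines.append(line)
        return "".join(lines)

    def close(self):
        """Ask exiftool to leave -stay_open mode and wait for it to exit."""
        self.proc.stdin.write("-stay_open\nFalse\n")
        self.proc.stdin.flush()
        self.proc.wait()
```

Usage would be along the lines of `tool = PersistentExifTool()`, then `tool.run("-j", "photo.raw")` for each file, and `tool.close()` at the end. The process startup cost is then paid once rather than per file.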

Unfortunately the msys version of lcms2 doesn’t include the fast float plugin, so you have to build it from source

I’m currently looking into enabling lcms2_fast_float for the ArchLinux AUR builds and stumbled upon the issue that the fast_float plugin is not included in the default package.
As far as I can tell, this is due to a licence change (MIT to GPL3) when enabling the fast_float plugin.

I may end up creating a separate lcms2 package with fast_float enabled in the ArchLinux AUR. As ART itself is GPL3 as well, this should be no problem with regard to licensing requirements.

update:
works now for art-rawconverter-git :slight_smile:

No problem with that, but what are the options required to build the plugin?

https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=lcms2-ff-git#n30

You need to add the --with-fastfloat flag.

Let’s try again – another test for Windows users.
https://drive.google.com/file/d/1iRpy86TqTGsb5FZ4uvC0SuqVWel_-zEB/view?usp=share_link

Thanks in advance!

Wow, what a difference. Whatever you did makes a big difference; ART is very snappy now!

Great, thanks for testing!

Mostly, I finally read some of the openmp docs :slight_smile:

LOL. Like they say, when all else fails read the directions :sweat_smile:

Thanks for your work, ART is a great piece of software I use a lot.

Ed

:man_facepalming: :man_facepalming: I must confess something. :man_facepalming: :man_facepalming: :man_shrugging:

Most of the slowdown in my video was caused by Transform > Geometry > Perspective being activated in my default profile (for whatever reason :upside_down_face:[in fact, there is no reason, I have never used it :man_shrugging:]). Although nothing has been set in the module, just being activated significantly slows down zooming and panning.

image

The less embarrassing news is, there’s a noticeable speedup in the NR chroma auto setting. :+1:

ah ok. That’s expected though. In principle, ART should compute the smallest region of interest (roi) that is needed to perform the perspective correction. The current pipeline however is a bit rigid in this respect, and doesn’t easily support the dynamic resizing of the roi, so the code is conservative and uses a “safe” roi. Which means that simply activating the module will cause ART to process a larger portion of the image, and not just what you see on the screen. So it will be slower… not something I am likely to change in the short term though, sorry.
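To illustrate the difference between a tight roi and a "safe" one, here is a small Python sketch. It is not ART's code. It assumes the perspective correction is a plain 3x3 homography `h` mapping output (screen) coordinates to source coordinates, that the visible rectangle stays on one side of the horizon line, and that the "safe" roi is the whole image. It also ignores any interpolation margin. All function names are hypothetical.

```python
import math


def apply_homography(h, x, y):
    """Map (x, y) through a 3x3 homography given as nested lists."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


def tight_roi(h, view, image_size):
    """Bounding box, in source pixels, of the visible output rectangle.

    view is (x0, y0, x1, y1) in output pixels; the result is clamped to the image.
    """
    x0, y0, x1, y1 = view
    corners = [apply_homography(h, x, y)
               for x, y in ((x0, y0), (x1, y0), (x1, y1), (x0, y1))]
    xs = [c[0] for c in corners]
    ys = [c[1] for c in corners]
    width, height = image_size
    return (max(0, math.floor(min(xs))), max(0, math.floor(min(ys))),
            min(width, math.ceil(max(xs))), min(height, math.ceil(max(ys))))


def safe_roi(image_size):
    """Conservative choice for this sketch: process the whole image."""
    width, height = image_size
    return (0, 0, width, height)


def area(roi):
    """Number of pixels covered by an (x0, y0, x1, y1) roi."""
    return max(0, roi[2] - roi[0]) * max(0, roi[3] - roi[1])
```

For a small on-screen window in a 24MP image, `area(tight_roi(...))` is a tiny fraction of `area(safe_roi(...))`. That gap is roughly the extra work a conservative roi costs while zooming and panning.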

Yes, it does seem faster. I can turn Noise Reduction on and there’s less lag on my system. Definitely appears to be an improvement. Excellent!

I noticed one functionally inconsequential (cosmetic-only) item in this version I’ve not seen before. When the File Browser reads a directory there’s sometimes a small vertical progress bar that appears just above the preferences button on the lower left, which I assume has always been there. That bar is a couple of pixels too wide for the space it’s in, so when it appears everything to the right of it (i.e., almost the entire screen) gets momentarily shifted right and then back by a few pixels to make room for it. The result is the screen “jumps” at times when changing folders.

I was able to catch it in a screenshot and it’s indeed just a tad too wide:

image

Zoomed-in 400% for clarity:

image

I added the colored boxes to clarify the difference. The red box is the width with the progress bar visible; the yellow box was drawn on another screenshot without it and pasted in for comparison.

Definitely a small thing, but it causes a very visible effect.

Thanks.

Looks normal on my side.