Darktable troubleshooting cache speed

Hi All,
I am trying to troubleshoot cache reading speed in full-screen preview.
I am running darktable 4.8.1 from Flatpak.

System info is below

OS: Pop!_OS 22.04 LTS x86_64 
Host: Serval WS serw12 
Kernel: 6.9.3-76060903-generic 
Uptime: 21 mins 
Packages: 3189 (dpkg), 86 (flatpak) 
Shell: bash 5.1.16 
Resolution: 2560x1440 
DE: GNOME 
WM: Mutter 
WM Theme: Pop 
Theme: Pop [GTK2/3] 
Icons: Pop [GTK2/3] 
Terminal: gnome-terminal 
CPU: AMD Ryzen 9 3900 (24) @ 3.100GHz 
GPU: NVIDIA GeForce RTX 2070 Mobile / Max-Q Refresh 
Memory: 4342MiB / 64194MiB 

The issue is:
When I use the full-screen preview in the lighttable, navigating to the next image takes a few seconds. For some reason I see it more with unprocessed images; I don’t think I see it with processed images.

I made sure that cache is generated up to level 5 for all images. Then tried to assess the performance.

flatpak run org.darktable.Darktable -d cache -d perf -d verbose

The closest I have come to a cause is that the system sometimes appears to switch to the CPU; see the line at 83.4182.

    81.9134 [mipmap_cache] grab mip 5 for image 218987 from disk cache
    81.9149 [mipmap_cache] thumbs fill 712.82/4012.16 MB (17.77%)
    81.9149 [mipmap_cache] float fill 0/16 slots (0.00%)
    81.9149 [mipmap_cache] full  fill 2/16 slots (12.50%)
    81.9149 [mipmap_cache] level | near match | miss | stand-in | fetches | total rq
    81.9149 [mipmap_cache] thumb |  52.38% |  35.03% | 100.00%  |   0.00% | 100.00%
    81.9149 [mipmap_cache] float |   -nan% |   -nan% |   0.00%  |   0.00% |   0.00%
    81.9149 [mipmap_cache] full  |   -nan% |   -nan% |   0.00%  | 100.00% |   0.00%


    81.9369 [mipmap_cache] grab mip 5 for image 218988 from disk cache
    83.4182 [dt_dev_load_raw] loading the image. took 1.499 secs (0.612 CPU)
    83.5203 [mipmap_cache] thumbs fill 728.45/4012.16 MB (18.16%)
    83.5203 [mipmap_cache] float fill 0/16 slots (0.00%)
    83.5203 [mipmap_cache] full  fill 3/16 slots (18.75%)
    83.5203 [mipmap_cache] level | near match | miss | stand-in | fetches | total rq
    83.5203 [mipmap_cache] thumb |  52.20% |  34.92% | 100.00%  |   0.00% | 100.00%
    83.5203 [mipmap_cache] float |   -nan% |   -nan% |   0.00%  |   0.00% |   0.00%
    83.5203 [mipmap_cache] full  |   -nan% |   -nan% |   0.00%  | 100.00% |   0.00%


    83.5481 [mipmap_cache] grab mip 5 for image 218989 from disk cache
    83.5497 [mipmap_cache] thumbs fill 744.07/4012.16 MB (18.55%)
    83.5497 [mipmap_cache] float fill 0/16 slots (0.00%)
    83.5497 [mipmap_cache] full  fill 3/16 slots (18.75%)
    83.5497 [mipmap_cache] level | near match | miss | stand-in | fetches | total rq
    83.5497 [mipmap_cache] thumb |  52.20% |  34.92% | 100.00%  |   0.00% | 100.00%
    83.5497 [mipmap_cache] float |   -nan% |   -nan% |   0.00%  |   0.00% |   0.00%
    83.5497 [mipmap_cache] full  |   -nan% |   -nan% |   0.00%  | 100.00% |   0.00%

I tried a second time after changing the view to 1 up (not full screen). The system does not appear to have issues with level 4 images. As soon as I switched to full screen it again appeared to use the CPU; see the line at 73.6180.

    61.2670 [mipmap_cache] grab mip 4 for image 219096 from disk cache
    61.3083 [mipmap_cache] grab mip 4 for image 219097 from disk cache
    61.3471 [mipmap_cache] grab mip 4 for image 219098 from disk cache
    61.4536 [mipmap_cache] grab mip 4 for image 219099 from disk cache
    61.9182 [mipmap_cache] grab mip 4 for image 219100 from disk cache
    61.9491 [mipmap_cache] grab mip 4 for image 219101 from disk cache
    62.0062 [mipmap_cache] grab mip 4 for image 219102 from disk cache
    62.1014 [mipmap_cache] grab mip 4 for image 219103 from disk cache
    62.2291 [mipmap_cache] grab mip 4 for image 219104 from disk cache
    62.2749 [mipmap_cache] grab mip 4 for image 219105 from disk cache
    62.8387 [mipmap_cache] grab mip 4 for image 219106 from disk cache
    62.8609 [mipmap_cache] grab mip 4 for image 219107 from disk cache
    62.9012 [mipmap_cache] grab mip 4 for image 219108 from disk cache
    62.9570 [mipmap_cache] grab mip 4 for image 219109 from disk cache
    63.0686 [mipmap_cache] grab mip 4 for image 219110 from disk cache
    66.1532 [mipmap_cache] grab mip 4 for image 219083 from disk cache
    71.7992 [mipmap_cache] thumbs fill 1037.12/4012.16 MB (25.85%)
    71.7993 [mipmap_cache] float fill 0/16 slots (0.00%)
    71.7993 [mipmap_cache] full  fill 0/16 slots (0.00%)
    71.7993 [mipmap_cache] level | near match | miss | stand-in | fetches | total rq
    71.7993 [mipmap_cache] thumb |  57.00% |  57.00% |   -nan%  |   -nan% | 100.00%
    71.7993 [mipmap_cache] float |   -nan% |   -nan% |   -nan%  |   -nan% |   0.00%
    71.7993 [mipmap_cache] full  |   -nan% |   -nan% |   -nan%  |   -nan% |   0.00%


    71.8326 [mipmap_cache] grab mip 5 for image 219084 from disk cache
    71.8574 [mipmap_cache] grab mip 5 for image 219082 from disk cache
    73.6180 [dt_dev_load_raw] loading the image. took 1.804 secs (0.682 CPU)
    73.7491 [mipmap_cache] grab mip 5 for image 219083 from disk cache
    78.3090 [thumb crawler] max_mip=5, 0 thumbs updated, 0 not found, all done.
    81.3173 [thumb crawler] max_mip=5, 0 thumbs updated, 0 not found, all done.
    84.3247 [thumb crawler] max_mip=5, 0 thumbs updated, 0 not found, all done.
    87.0860 [mipmap_cache] thumbs fill 1083.99/4012.16 MB (27.02%)
    87.0860 [mipmap_cache] float fill 0/16 slots (0.00%)
    87.0860 [mipmap_cache] full  fill 1/16 slots (6.25%)
    87.0860 [mipmap_cache] level | near match | miss | stand-in | fetches | total rq
    87.0861 [mipmap_cache] thumb |  56.94% |  56.46% | 100.00%  |   0.00% | 100.00%
    87.0861 [mipmap_cache] float |   -nan% |   -nan% |   0.00%  |   0.00% |   0.00%
    87.0861 [mipmap_cache] full  |   -nan% |   -nan% |   0.00%  | 100.00% |   0.00%
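
One way to pin down which navigation steps trigger a raw load is to scan the debug output for `[dt_dev_load_raw]` lines and pair each one with the disk-cache grab logged just before it. The sketch below is a minimal Python example that assumes the log format shown above; `find_raw_loads` and the `SAMPLE` text are illustrative names, not part of darktable. As far as I understand the format, the figure in parentheses is CPU time consumed rather than a sign of the pipeline switching to the CPU, so a wall-clock time well above the CPU time would be consistent with time spent waiting on file reads.

```python
import re

# Matches lines such as:
#   83.4182 [dt_dev_load_raw] loading the image. took 1.499 secs (0.612 CPU)
RAW_LOAD = re.compile(
    r"^\s*(?P<stamp>\d+\.\d+)\s+\[dt_dev_load_raw\].*?"
    r"took\s+(?P<wall>\d+\.\d+)\s+secs\s+\((?P<cpu>\d+\.\d+)\s+CPU\)"
)
# Matches lines such as:
#   81.9369 [mipmap_cache] grab mip 5 for image 218988 from disk cache
GRAB = re.compile(
    r"^\s*(?P<stamp>\d+\.\d+)\s+\[mipmap_cache\]\s+grab mip (?P<mip>\d+) "
    r"for image (?P<imgid>\d+) from disk cache"
)


def find_raw_loads(log_text):
    """Return one dict per raw load, with the latest disk-cache grab seen before it."""
    loads = []
    last_grab = None
    for line in log_text.splitlines():
        grab = GRAB.match(line)
        if grab:
            last_grab = (int(grab["mip"]), int(grab["imgid"]))
            continue
        raw = RAW_LOAD.match(line)
        if raw:
            loads.append({
                "stamp": float(raw["stamp"]),
                "wall_secs": float(raw["wall"]),
                "cpu_secs": float(raw["cpu"]),
                "previous_grab": last_grab,  # (mip level, image id) or None
            })
    return loads


# A few lines copied from the first log excerpt, used here only as sample input.
SAMPLE = """\
    81.9134 [mipmap_cache] grab mip 5 for image 218987 from disk cache
    81.9369 [mipmap_cache] grab mip 5 for image 218988 from disk cache
    83.4182 [dt_dev_load_raw] loading the image. took 1.499 secs (0.612 CPU)
    83.5481 [mipmap_cache] grab mip 5 for image 218989 from disk cache
"""

for load in find_raw_loads(SAMPLE):
    print(load)
```

To use it on a real session, redirect the output of the `flatpak run` command above into a file and pass the file's contents to `find_raw_loads`.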

I did search the forum and stumbled on the following thread

These are my changes to darktablerc

From
opencl_device_priority=*/!0,*/*/*
to
opencl_device_priority=+0/*/+0/*/*


from
opencl_mandatory_timeout=200
to
opencl_mandatory_timeout=20000


from
resource_large=700 16 128 900
to
resource_large=700 64 128 900

Including darktablerc here
darktablerc.zip (14.0 KB)

The changes did improve the overall experience, but only in the processing view (darkroom). Also, I made these changes before capturing the log output above.

Can you help me, please? This issue has been around for a while for me and I am not sure what else to do.
I mostly struggle with it when I have a very large number of pictures to sort (sometimes a few thousand).

At cache level 5, there is probably no embedded preview image (.cr3 files have an embedded image only up to size 3). So to generate the full preview, the raw has to be opened and processed, which takes approximately 1 second.

My solution was a Lua script that runs on import and generates the lighttable cache (size 3) and the full preview cache (size 6). The lighttable cache generates at 100+ images/sec. The full preview cache generates at ~3 images/sec.

My normal import size is 1000+ images. I import and then wait 10 - 15 minutes while the cache generates. After the cache is generated, I can cull at full speed in full size preview. You can still use darktable while it’s generating the cache, so you could probably let it get about halfway through generating the full size previews, then start culling without ever catching up to it.
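
As a rough back-of-the-envelope check on those rates, the Python sketch below estimates total cache generation time for an import. It assumes the two passes run one after the other at the approximate rates quoted above (100 and 3 images/sec); `cache_time_minutes` is an illustrative helper name, and real rates will vary with hardware and storage speed.

```python
def cache_time_minutes(n_images, thumbs_per_sec=100.0, previews_per_sec=3.0):
    """Estimate minutes to build both caches, assuming one pass follows the other."""
    seconds = n_images / thumbs_per_sec + n_images / previews_per_sec
    return seconds / 60.0


for n in (1000, 2000, 3000):
    print(f"{n} images: about {cache_time_minutes(n):.1f} min")
```

At these rates the full preview pass dominates almost entirely, so a 10 to 15 minute wait would correspond to an import of roughly 2,000 to 2,500 images.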


What are your thumbnail settings? Maybe the problem lies there.

I used the generate cache function

flatpak run --command=darktable-generate-cache org.darktable.Darktable -m 5

and verified the cache is generated. I expected that this should have been enough.

[Screenshot: thumbnail settings]

Here are the settings. I tried changing to 4k and WQXGA. For now I am only observing slowdown on full screen.

I tried 1 up as a workaround, but while the display is much faster, I sometimes miss the tagging.
Last night I experimented with adding a bit of exposure, just so the image is registered as modified. To some extent it helps. I am not sure what I am missing.

I think I am getting closer to understanding what is happening (but I don’t know how to fix it).
My pictures are on a NAS drive (a much slower drive).
When I am navigating using the 1 up option (or level 4 cache) - DT reads the cached images.
This is why there is hardly any activity on the network card.

When I switch to a full screen preview the network activity picks up significantly.

That means DT reads the full high-res raw files to load the preview and does not rely on the generated cache.

Is this behavior configurable, or should it be handled on GitHub?
I did report it before

@hannoschwalm what is your opinion?

Reading this


my understanding is that if the needed thumbnail is higher than 4k (in my case) the full image will be used for rendering but if it is lower - it should not.
The level 5 cache is lower than 4k and I am on a 2k monitor.
What am I missing or misunderstanding?
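
To make the size question concrete, here is a small Python sketch that picks the smallest cache level able to cover a given display. The `MIP_SIZES` table is an assumption on my part, taken from my recollection of the mip sizes in darktable's source rather than from anything in this thread, and `smallest_mip_for` is an illustrative name; darktable's real selection logic may differ, so please check it against your version.

```python
# ASSUMED maximum dimensions (width, height) for thumbnail mip levels 0 to 7.
# These values are not stated in this thread; verify them for your version.
MIP_SIZES = [
    (180, 110),
    (360, 225),
    (720, 450),
    (1440, 900),
    (1920, 1200),
    (2560, 1600),
    (4096, 2560),
    (5120, 3200),
]


def smallest_mip_for(width, height):
    """Return the lowest mip level whose box covers width x height, or None."""
    for level, (mip_w, mip_h) in enumerate(MIP_SIZES):
        if mip_w >= width and mip_h >= height:
            return level
    return None


print(smallest_mip_for(2560, 1440))  # the 2k monitor described in this thread
print(smallest_mip_for(1920, 1080))
```

Under that assumed table, a 2560x1440 display needs level 5, which matches the `grab mip 5` lines in the full-screen log.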

That’s not correct.

It depends on the size of the embedded preview in the raw file. I’ll use my Canon EOS R7 as an example. The image is 32MP. The raw has 2 embedded previews. One is cache size 3 and the other is cache size 1 (IIRC). So, if I want an HD (cache size 4) or larger full preview, the only way to get it is to open the raw file and process the raw data to generate the full size preview.

The other problem is the crawler. It tries to generate cache images for every image in the database, but I believe that it starts at the first record in the database (oldest image) and works toward the newest. So the images you want the cache for are likely to be the last ones that it gets generated for. It also only runs when darktable is idle.

So - with my current settings - high quality render should be used above WQXGA (2k)
[Screenshot: current settings]

I’ve tried changing “use RAW” to “always” to ignore the embedded .JPG altogether, and I can do it again, but to the best of my memory the issue remained. Do you think this would work? Ignoring the embedded .JPG should completely eliminate it as a cause of the issue.

The crawler is active, but I already forced generation of all thumbnails with

flatpak run --command=darktable-generate-cache org.darktable.Darktable -m 5

and it completed entirely.
I remember you posted https://github.com/darktable-org/darktable/files/13800069/import_cache.zip
The reason I haven’t tried it is that I tend to force-render the thumbnails.

Does the same script work on 4.8.1, or has it been updated?

Must be frustrating…hope you sort it out. I feel like if you have generated the thumbs then it should be fast…

It is very frustrating - especially when there are thousands of pictures to sort.
I changed the settings to use the RAW file (never the embedded .JPG) but this did not help.

I also found a previous discussion on how to extract the embedded .jpg

I am using a Sony a7cII (33 MP). Extracting all embedded previews shows 3 of them.
20240518_172005_08011.ARW (35.2 MB)
20240518_172005_08011-preview1


The third one actually has a very high resolution: 7008 × 4672 pixels

That means - for a 2k screen these embedded previews should be more than enough

I have no idea how to explain it, but in my repeated testing, files that are processed do not appear to re-render, while files that are not processed do re-render.

The GitHub issue is this one
