While re-importing my photos to rename all the files to unique names, I found that the import time increases significantly as the library grows.
With 183,000 photos in the library, importing 128 photos took 1m53s.
So I removed library.db to test with an empty library: the same 128 photos needed just 4s to import.
Importing 10K photos takes about 2~3 hours.
I am running Darktable 3.4 on Windows 10, AMD 2700X, 32 GB RAM. CPU and disk usage stay low during import.
This should/could be made faster. It’s a very nasty and unforgiving combination of database transactions (so that every select doesn’t take its own micro-lock; the whole import session could be done with one lock), thread locking (if you want to do anything during import, the lock should be per imported image and released ASAP), cache locking in threading…
It can be helped, but it requires loads of work in many parts of darktable's handling, including stuff like signal blocking/ignoring, thread synchronization etc.
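To illustrate just the transaction part of this, here is a minimal sketch in Python's sqlite3 module, not darktable's actual C code. The images table is a simplified, hypothetical stand-in for the real schema, and import_images is a made-up helper. It only shows the difference between one transaction for the whole import session and one transaction per image.

```python
import sqlite3

INSERT_SQL = "INSERT INTO images (film_id, filename) VALUES (?, ?)"

def make_library():
    """Create an in-memory library with a simplified, hypothetical images table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE images (id INTEGER PRIMARY KEY, film_id INTEGER, filename TEXT)"
    )
    return conn

def import_images(conn, rows, single_transaction):
    """Insert (film_id, filename) rows and return how many commits were issued."""
    commits = 0
    if single_transaction:
        # One transaction, so one write lock, for the whole import session.
        with conn:
            conn.executemany(INSERT_SQL, rows)
        commits = 1
    else:
        # One transaction per image: the lock is taken and released each time.
        for row in rows:
            with conn:
                conn.execute(INSERT_SQL, row)
            commits += 1
    return commits

rows = [(1, f"DSC{i:04d}.CR2") for i in range(128)]

batched = make_library()
batched_commits = import_images(batched, rows, single_transaction=True)

per_image = make_library()
per_image_commits = import_images(per_image, rows, single_transaction=False)

batched_count = batched.execute("SELECT COUNT(*) FROM images").fetchone()[0]
per_image_count = per_image.execute("SELECT COUNT(*) FROM images").fetchone()[0]
print(batched_commits, per_image_commits, batched_count, per_image_count)
```

Both variants end up with the same rows. The per-image variant is the one that keeps the UI responsive during import, which is why the thread and cache locking mentioned above make the single-transaction approach harder than it looks here.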
How does adding filename as a second column to images_film_id_index affect import speed on large collections? Could someone with such a collection give it a test (perhaps using sqlitebrowser)?
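For anyone who wants to check what the planner does before touching a real library, here is a rough sketch using Python's sqlite3 module on a throwaway in-memory database. The table layout and the lookup query are assumptions for illustration (a simplified images table, and images_film_id_index assumed to cover only film_id originally); they are not copied from darktable's source.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE images (id INTEGER PRIMARY KEY, film_id INTEGER, filename TEXT)"
)
# Assumed original index: film_id only.
conn.execute("CREATE INDEX images_film_id_index ON images (film_id)")

# Hypothetical duplicate check done once per imported file.
lookup = "SELECT id FROM images WHERE film_id = ? AND filename = ?"
args = (1, "DSC0001.CR2")

plan_before = query_plan(conn, lookup, args)

# Rebuild the index with filename as a second column.
conn.execute("DROP INDEX images_film_id_index")
conn.execute("CREATE INDEX images_film_id_index ON images (film_id, filename)")
plan_after = query_plan(conn, lookup, args)

print("before:", plan_before)
print("after: ", plan_after)
```

With the single-column index the plan can only narrow the search by film_id, so every file already in that film roll still has to be compared by filename. With the two-column index both conditions are resolved inside the index. On a real library, work on a copy of library.db.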
fwiw: importing 185 images from the command line as:
darktable --library :memory: /data/photos/group/*
takes approx 10 seconds
all raw and all with accompanying xmp files
openSUSE Tumbleweed
10 year old i7 970, 36gb ram, nvidia GTS 450
all files are on local system rotating rust
same start with a library containing >189k images
approx 161 seconds
during the soccer season I normally work on sets of 400-600 images, and it becomes much easier/quicker to work on and export the images, upload them to my server, then import the worked images and accompanying xmp files into the library and later delete the rejected shots.
also doing the original work on ssd before moving to rust.
and sorry for, I guess, spamming/multi-posting; I am not really familiar with the user interface, but I will learn
@fatman Thank you, 40s shaved, but still too much.
@johnny-bit No experience with PRs, alas. There are indeed a couple more indices to alter or add. Here are mine, but since it was done on a secondary machine, it was never tested against a larger collection.
The query in line 1410/1468 of image.c is another candidate. LIKE is bad. Could be solved by adding another column groupname to images containing only the filename without the extension, and then adding another appropriate index over all three columns.
Plus, collection.c line 1006: do we really need LIKE in WHERE folder LIKE ..., and why?
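The groupname suggestion above could look roughly like the following sketch (Python's sqlite3 module, in-memory database). The actual query from image.c is not reproduced here; the LIKE lookup, the index name images_film_id_groupname and the choice to index only film_id and groupname are assumptions made for illustration. By default SQLite's LIKE is case-insensitive and generally cannot use an ordinary index on the column, while an equality test on a precomputed groupname can.

```python
import os
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE images ("
    "id INTEGER PRIMARY KEY, film_id INTEGER, filename TEXT, groupname TEXT)"
)
# Hypothetical index name and column choice.
conn.execute("CREATE INDEX images_film_id_groupname ON images (film_id, groupname)")

files = ["DSC0001.CR2", "DSC0001.JPG", "DSC0002.CR2"]
with conn:
    conn.executemany(
        "INSERT INTO images (film_id, filename, groupname) VALUES (?, ?, ?)",
        # groupname is the filename without its extension.
        [(1, name, os.path.splitext(name)[0]) for name in files],
    )

like_sql = (
    "SELECT filename FROM images "
    "WHERE film_id = ? AND filename LIKE ? ORDER BY filename"
)
eq_sql = (
    "SELECT filename FROM images "
    "WHERE film_id = ? AND groupname = ? ORDER BY filename"
)

like_rows = [r[0] for r in conn.execute(like_sql, (1, "DSC0001.%"))]
eq_rows = [r[0] for r in conn.execute(eq_sql, (1, "DSC0001"))]
eq_plan = query_plan(conn, eq_sql, (1, "DSC0001"))

print(like_rows, eq_rows)
print(eq_plan)
```

Both queries find the same group of files, but only the equality version lets the index resolve the name as well as the film roll. The cost is one more column to keep in sync whenever a file is renamed.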
@fatman Do you get any benefit from adding another index? CREATE INDEX 'images_film_id_id_fn' ON 'images' ('film_id','id','filename');
No extra benefit. It seems the query uses either images_film_id_index with filename added or images_film_id_id_fn.
What I tested:
1:53 (Original)
1:19 (images_film_id_id_fn)
1:19 (images_film_id_index with filename added)
1:19 (images_film_id_id_fn & images_film_id_index with filename added)
1:11 (tagged_images_position_index & images_film_id_id_fn & images_film_id_index with filename added)
I took a look at image.c; it seems SELECT MAX(position) FROM tagged_images is required for each new tagged_images row, so I added this index.
Without the index, the SELECT MAX(position) takes 180ms; after adding the index, it takes 4ms. My tagged_images table has 417K rows.
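The effect of that index can be reproduced on a toy database. This is only a sketch with Python's sqlite3 module: the tagged_images columns here are a simplified assumption, and the index is assumed to be on position alone.

```python
import sqlite3

def query_plan(conn, sql):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

conn = sqlite3.connect(":memory:")
# Simplified, assumed layout of tagged_images.
conn.execute(
    "CREATE TABLE tagged_images (imgid INTEGER, tagid INTEGER, position INTEGER)"
)
with conn:
    conn.executemany(
        "INSERT INTO tagged_images VALUES (?, ?, ?)",
        [(i, i % 50, i) for i in range(10000)],
    )

max_sql = "SELECT MAX(position) FROM tagged_images"

# Without an index SQLite has to read every row to find the maximum.
plan_before = query_plan(conn, max_sql)

conn.execute(
    "CREATE INDEX tagged_images_position_index ON tagged_images (position)"
)
# With the index it can jump straight to the last index entry.
plan_after = query_plan(conn, max_sql)
max_position = conn.execute(max_sql).fetchone()[0]

print("before:", plan_before)
print("after: ", plan_after)
print("max position:", max_position)
```

Since the full scan is repeated for every new tagged_images row, the cost without the index grows with the size of the table, which fits the slowdown seen on large libraries.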
Sorry for the late reply, I haven’t visited the forum in a while.
There is no difference after adding the images_filename_index_nc index.
BTW, I gave up on Darktable for cataloging my photo library. I tried importing 180K photos into digiKam: it took 2 hours and 10 minutes (Darktable needs several days), and the tag management is much better.