I have a good PC (i9-14900, 64 GB RAM, NVIDIA RTX 4070) running Ubuntu 24.04, but Darktable (4.8.1) takes around 3 seconds to go from one JPEG to another in the darkroom. I have been using Darktable for a very long time and have been upgrading regularly.
I have enabled the logs and I can see it sits on a single SQL query for a bit less than 3 seconds. Here is the line in question and the next one:
942.2825 [sql] /home/patrick/src/darktable/src/common/tags.c:725, function dt_tag_get_attached(): prepare "SELECT DISTINCT I.tagid, T.name, T.flags, T.synonyms, COUNT(DISTINCT I.imgid) AS inb FROM main.tagged_images AS I JOIN data.tags AS T ON T.id = I.tagid WHERE I.imgid IN (SELECT imgid FROM main.selected_images) AND T.id NOT IN memory.darktable_tags GROUP BY I.tagid ORDER by T.name"
944.9716 [sql] ...
OK, I have many images (11472), tagged_images (18391) and a lot of tags (32315) in my DB files. But this should not take that long.
I’ve tried to do this query in a shell (minus the memory.darktable_tags part) and it wasn’t that slow. So it’s maybe not this query…
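For anyone who wants to repeat that test outside darktable, here is a minimal sketch in Python using only the standard sqlite3 module. It runs the logged query, including the memory.darktable_tags part, against a tiny in-memory stand-in, times it, and returns the query plan. The table shapes for tagged_images, selected_images and darktable_tags are assumptions inferred from the column names in the query, not the real darktable schema, and the helper names (open_demo, time_query, query_plan) are hypothetical.

```python
import sqlite3
import time

# The query from the log above, reformatted for readability.
QUERY = """
SELECT DISTINCT I.tagid, T.name, T.flags, T.synonyms,
       COUNT(DISTINCT I.imgid) AS inb
FROM main.tagged_images AS I
JOIN data.tags AS T ON T.id = I.tagid
WHERE I.imgid IN (SELECT imgid FROM main.selected_images)
  AND T.id NOT IN memory.darktable_tags
GROUP BY I.tagid
ORDER BY T.name
"""


def open_demo():
    """Build a small stand-in: main ~ library.db, data ~ data.db, memory ~ temp tables."""
    con = sqlite3.connect(":memory:")
    con.execute("ATTACH DATABASE ':memory:' AS data")
    con.execute("ATTACH DATABASE ':memory:' AS memory")
    # Assumed minimal table shapes; only the columns the query touches.
    con.executescript("""
        CREATE TABLE main.tagged_images (imgid INTEGER, tagid INTEGER);
        CREATE TABLE main.selected_images (imgid INTEGER);
        CREATE TABLE data.tags (id INTEGER PRIMARY KEY, name VARCHAR,
                                synonyms VARCHAR, flags INTEGER);
        CREATE TABLE memory.darktable_tags (tagid INTEGER);
    """)
    con.executemany("INSERT INTO data.tags (id, name) VALUES (?, ?)",
                    [(1, "places|Glasgow"), (2, "people|Anna")])
    con.executemany("INSERT INTO main.tagged_images VALUES (?, ?)",
                    [(10, 1), (10, 2), (11, 1)])
    con.execute("INSERT INTO main.selected_images VALUES (10)")
    return con


def time_query(con, sql):
    """Run sql, fetch all rows, and return (rows, elapsed seconds)."""
    start = time.perf_counter()
    rows = con.execute(sql).fetchall()
    return rows, time.perf_counter() - start


def query_plan(con, sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN for sql."""
    return [row[-1] for row in con.execute("EXPLAIN QUERY PLAN " + sql)]
```

Against copies of the real files, one would presumably open the library.db copy and ATTACH the data.db copy as data instead of using in-memory databases. Comparing the plan with and without the memory.darktable_tags condition might show whether the part left out of the shell test matters.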
After backing up my ~/.config/darktable files, I’ve tried to delete the contents of 2 suspect tables:
~/.config/darktable$ sqlite3 library.db
SQLite version 3.45.1 2024-01-30 16:01:20
Enter ".help" for usage hints.
sqlite> delete from tagged_images;
sqlite> vacuum;
~/.config/darktable$ sqlite3 data.db
SQLite version 3.45.1 2024-01-30 16:01:20
Enter ".help" for usage hints.
sqlite> delete from tags;
sqlite> vacuum;
I seem to still have all my collections and the application is now very responsive.
What are those tables for?
What do I lose by emptying them?
Has anybody else had this kind of performance problem?
Is there a less drastic solution to my problem?
I agree: that is something like 30x the number of tags I have for 39800 images, though I’m not especially good at tagging.
Now it is up to @pvalsecc to explain this very large number of tags.
And, as you can see below, the index on tags is UNIQUE on ‘name’ which is the name given to the tag…
sqlite> .schema tags
CREATE TABLE tags (id INTEGER PRIMARY KEY, name VARCHAR, synonyms VARCHAR, flags INTEGER);
CREATE UNIQUE INDEX tags_name_idx ON tags (name);
But is it realistic to have over 30k tags, or could there have been some duplication due to a bug?
I never did anything to add a tag manually, apart from using the little cross for pictures I’m planning to delete. But then, those are deleted soon after.
Looking at the content of the tags table, I see tons of records like that:
<gpx creator="Converted by fit2gpx, http://velo100.ru/garmin-fit-to-gpx from Device ID: 831 (COROS)" version="1.1" xmlns="http://www.topografix.com/GPX/1/1" xmlns:gpxtrx="http://www.garmin.com/xmlschemas/GpxExtensions/v3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:gpxtpx="http://www.garmin.com/xmlschemas/TrackPointExtension/v1" xmlns:gpxx="http://www.garmin.com/xmlschemas/WaypointExtension/v1" xmlns:nmea="http://trekbuddy.net/2009/01/gpx/nmea">|<trk>|<trkseg>|<trkpt lat="55.96558350138366" lon="-4.345451602712274">|<extensions>|<gpxtpx:TrackPointExtension>|<gpxtpx:course>3945.0</gpxtpx:course>
I guess those were created when I geotagged my pictures using GPX files:
sqlite> select count(*) from tags;
32315
sqlite> select count(*) from tags where name like '<GPX%';
32218
I’ve tried deleting only those weird GPX tags and yes, performance is back to something good. And I’ve checked: the pictures still show on the map at the correct location.
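A minimal sketch of that targeted cleanup, in Python with the standard sqlite3 module, run here against an in-memory stand-in rather than the real files. The tags schema is the one shown by .schema above; the two-column shape of tagged_images is an assumption based on the columns used in the logged query, and build_demo_db and purge_gpx_tags are hypothetical helper names. SQLite's LIKE is case-insensitive for ASCII by default, which is why the '<GPX%' pattern above matches the lowercase <gpx entries.

```python
import sqlite3


def build_demo_db():
    """Tiny in-memory stand-in: main ~ library.db, data ~ data.db."""
    con = sqlite3.connect(":memory:")
    con.execute("ATTACH DATABASE ':memory:' AS data")
    # Schema and index as shown by `.schema tags` in this thread.
    con.execute("CREATE TABLE data.tags (id INTEGER PRIMARY KEY, name VARCHAR, "
                "synonyms VARCHAR, flags INTEGER)")
    con.execute("CREATE UNIQUE INDEX data.tags_name_idx ON tags (name)")
    # Assumed minimal shape; the real table may have more columns.
    con.execute("CREATE TABLE main.tagged_images (imgid INTEGER, tagid INTEGER)")
    con.executemany("INSERT INTO data.tags (id, name) VALUES (?, ?)", [
        (1, "darktable|format|jpg"),
        (2, '<gpx creator="x">|<trk>'),
        (3, '<gpx creator="x">|<trk>|<trkseg>'),
        (4, "places|Glasgow"),
    ])
    con.executemany("INSERT INTO main.tagged_images VALUES (?, ?)",
                    [(10, 1), (10, 2), (11, 3), (11, 4)])
    con.commit()
    return con


def purge_gpx_tags(con, pattern="<gpx%"):
    """Delete tags whose name matches pattern, plus their image links.

    Returns the number of rows removed from data.tags.
    """
    with con:  # one transaction: commit on success, roll back on error
        # Remove the image-to-tag links first, while the tag ids still exist.
        con.execute(
            "DELETE FROM main.tagged_images WHERE tagid IN "
            "(SELECT id FROM data.tags WHERE name LIKE ?)", (pattern,))
        cur = con.execute("DELETE FROM data.tags WHERE name LIKE ?", (pattern,))
    return cur.rowcount
```

As with the full delete earlier in the thread, this would only be sensible on backed-up files and with darktable closed; whether darktable needs anything else cleaned up afterwards is not something this sketch checks.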
Must be a bug in the geotagging widget of the lighttable.
Probably yes, though it may be a feature request instead. It’s not a bug in the sense that darktable works as designed, creating tags from metadata. However, it should filter them, which would be a new feature.
I have geotagged photos in dt using GPX files before, and I find nothing comparable to these <gpx... entries in the tags table of data.db…
Instead the GPS longitude/latitude are stored in the XMP sidecar files.
Did you geotag your photos using dt or another tool?
Edit: the index was dropped on purpose and should have no negative impact; the column is already used as the 1st column of a multi-column index. I’ve closed the issue on GitHub.