darktable: periodic database maintenance?

Hi.
I wonder if it is recommended to do some database cleanup tasks from time to time.
I did a quick research and found that there are statements like ANALYZE, OPTIMIZE and VACUUM that are meant to do such tasks.

1 Like

Huh, this is interesting idea!
I had a brief look at database.c file in darktable src just to see what’s there and what’s not :wink:

According to sqlite manual - https://www.sqlite.org/lang_analyze.html you should run PRAGMA optimize before closing database connection or every few hours for sqlite to properly help query scheduler with ANALYZE data. I don’t see it being done in current src so it might be good material for easy pull request? This would probably negate the need for any ANALYZE calls

VACUUM is only needed in case the DB file is rather large and full of empty space, so for example when you had huge library but cleared most of it. I don’t think I’d do it, except in rare cases.

Last thing - I don’t think sqlite has OPTIMIZE call - https://www.sqlite.org/lang.html but I might be wrong…

Maybe it is a PRAGMA function?

https://www.sqlite.org/pragma.html#syntax

The last changes of the database structure in master increased the size of the database by a factor of more than 2. Running a VACUUM did the job and reduced the size to approximately its previous value.

I already mentioned it here:

:slight_smile:

This seems like a potential place for VACUUM call in darktable code - right after database structure changes on pre-existing database - Another posible PR, but not so easy :wink:

Lemme explain myself then: If you’d use script that purges non existing images, tags etc or do changes like @pehar mentioned or do loads of things that impact size of database, then sure, VACUUM is the right way. Maybe even deserves place in /tools/ dir as a recommended script?

I’ve done 1st part in PR #4303 - This adds VACUUM and ANALYZE to tools doing purging and most likely to cause huge db size shifts.

The 2nd part is more contrived and I’d like some input on that, prefferably from advanced users and/or other devs (@Pascal_Obry ?). One of better suggested ways by sqlite is doing PRAGMA optimize call either right before closing db or once every couple hours. That’s usually no-op, but can slow down closing time in case sqlite decides it’s actually beneficial to run ANALYZE. Another suggestion is calculated VACUUM call depending on theoretical calculation of possible db fragmentation, however VACUUM call is expensive one and requires at least 2x space as taken by db (data+library) on disk to perform. I had 2 ideas - introduce config setting that would specify when to call VACUUM (eg once free_pages reach over 25% of size of database) and perform VACUUM before dt closing once this condition is reached. And 2nd - put button beneath miscellaneous section in core options tab in darktable preferences window with simple cleanup db name :wink:

btw: changes in tools managed to slice my db size in half! So I did need to perform maintenance :wink:

1 Like

What about adding a button somewhere in the main program? Running the scripts appears less obvious under Windows, plus there will be people who don’t go looking for the scripts to run.

Given that there are expensive tasks to run, it would possibly be better to offer the user the choice of when to run them, but without the need to run a script.

You need to be careful when attaching these scripts to a button in the main program!

The path to library.db, data.db and the cache directory are hard-coded to the default locations in those scripts. If, like me, you run multiple instances, each having their own dedicated config and cache directory those script will not work as intended.

Admittedly this will not be a problem for the fast majority of the users, but it is something that one should be made aware of if these scripts are linked to one (or more) buttons.

It would need to be one of those big red buttons with a fold-down cover, like for the ejector seats :slight_smile:

Does the script verify that no instances of dt are running?

This unfolded in an unexpected way to me, but , hey, thanks!

I believe if the database is fragmented prior to this PR, then VACUUM/ANALYZE should be applied regardless the user is purging files or not, correct?
Wouldn’t it be interesting to have another script just for that? It would be basically this, right?

image

Yes. I plan to add a bit of code to database.c and other places to help with db overall.

Yes, I do plan on doing just that. Most preferably in core options window.

Won’t be scripts, will be direct sqlite3 calls in C code :slight_smile:

2 Likes

Glad to read that it is going to be implemented this way. I assumed something like that might happen, but you know what they say about assumptions…

For the time being I made a script that fixes the scripts :sunglasses:

Ohhh… Share, plz?

Most preferably in core options window.

So would that be an auto-reset option… you click “clean up DB” in core options, nothing happens until you close and then it runs, and the tick-box is then cleared? Or something else…

Trully… just a button. you click on it and it does VACUUM; ANALYZE on db :slight_smile:

I’ll try to spend more time on something even less obtrusive than that and more user friendly (for those users who don’t want no stinkin “maintaince db” button in their settings) :stuck_out_tongue:

1 Like

Excellent :slight_smile:

I’ve send you a message.

1 Like