Things are very slow for me on the database front. The latest digiKam update from Fedora (6.1.0) updated the database schema and has been running for hours just to get to 86% on finding new items. Earlier today it kept getting stuck at 83%, complaining about ancient files (at least a decade old) that it should have scanned in the past. I'm using the internal MySQL, and right now I have two mysqld processes running (any way to figure out which is related to digiKam?) taking 8-ish% CPU. I added about 214 Canon RAW files, but that's never made things this slow in the past.
For a while now the program has been unresponsive whenever writing metadata to files (for minutes at a time); I usually have a book open while I'm adding metadata.
I switched to MySQL because I have hundreds of thousands of photos and videos, and I thought that was supposed to be the faster option. Currently the database is on a spinning HD, not an SSD, but I don't want to try to move it at least until it's done finding new items. Should I still be using MySQL, or should it be SQLite?
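One way I could probably tell which mysqld is digiKam's (a sketch; I'm assuming digiKam launches the internal server with a datadir under ~/.mysql.digikam, so the path below is a guess):

```shell
# List every running mysqld with its full arguments; the instance whose
# --datadir (or --defaults-file) points inside ~/.mysql.digikam should
# be digiKam's internal server. That path is an assumption on my part.
ps -eo pid,args | grep '[m]ysqld' || echo "no mysqld processes found"
```

The `[m]ysqld` bracket trick keeps grep from matching its own process in the listing.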
My reaction is an outlier. But I rely on the file system for storing, finding, and labeling image groupings. I'm a retired programmer who specialized in database stuff: Oracle, MySQL, DB2, and Postgres, and even XML databases like Sleepycat and the open source eXist. Actually I did a lot with XML databases. So my database reluctance has nothing to do with skills.
I don't want my image-processing life cycle to involve a database any more than I want to drive an 18-wheeler to the corner store for milk and beer.
I use the darkroom half of darktable religiously. With the lighttable half of darktable I only use the export button. There are sophisticated file-system directory structures for storing images that are a lot more accessible and easier to navigate than embedded database software.
I don't use digiKam, but I'm not certain that's a database issue (even millions of rows is nothing for a database). If it's scanning for images (as in a comparison), it makes me think something is slow on the file-system side.
Wanted to add that it's now at 86%, 3 hours later. I went into the mysql directory and mysql.err had lots of entries like:
2019-06-24 18:57:29 0 [Warning] InnoDB: Table mysql/innodb_index_stats has length mismatch in the column name table_name. Please run mysql_upgrade
as well as:
2019-06-24 20:01:12 10 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery.
Although that last entry is from 3 hours ago.
The only two files in .mysql.digikam/db_data with a current timestamp are: ibdata1 and ib_logfile0. Everything else has a date from today, but an older time.
In .mysql.digikam/db_data/digikam, the latest entries are:
-rw-rw----. 1 ermesa 37748736 Jun 24 23:43 ImageMetadata.ibd
-rw-rw----. 1 ermesa 17825792 Jun 24 23:43 ImageComments.ibd
-rw-rw----. 1 ermesa 573440 Jun 24 23:43 Tags.ibd
-rw-rw----. 1 ermesa 75497472 Jun 24 23:43 ImageTags.ibd
-rw-rw----. 1 ermesa 104857600 Jun 24 23:43 Images.ibd
-rw-rw----. 1 ermesa 31457280 Jun 24 23:43 ImageInformation.ibd
-rw-rw----. 1 ermesa 629145600 Jun 24 23:43 ImageCopyright.ibd
Not sure if that helps with figuring out what went wrong, but…
Well, either, I suppose. But in light of your message about possible corruption, that seems far less likely now! A missing log file is basically missing data…
Edit: but that doesn't rule out file-system corruption either.
Looks like the first warning could just be about version problems (perhaps it happened before the upgrade was complete). The second error is a bit more worrying though; searches seem to suggest perhaps files were moved while it was running, or corruption. I hope you have backups!
…elaborating a bit, about darktable having its own database. I have terabytes of raw images. I do a lot of focus stacking, so I have an order of magnitude more than most. One issue is I don't want the slices in any database, only stacked images, prior to processing (stack RAWs into a TIF, then edit that, then, as the final step, export a JPEG). Having the TIFs and JPEGs in a database makes sense, but not the raw slices.
Using a database for attaching keywords and rating values on a 0-10 scale is useful to darktable, but it's unavailable, and therefore useless, to other software.
If you use a well-thought-out file system way of storing and labeling images and/or image groups, then your efforts can be used by any and all software. Not just darktable.
More likely files being moved than corruption. It's not perfect, but I've got btrfs RAID-1 with a scrub going to protect against corruption/bit rot.
Don't know if Giles still hangs around here, but I'm still curious whether I should go back to SQLite. At any rate, in the long term I should just move the database off of btrfs, which is horrible for databases. It's just that at this moment that would require throwing another drive in there.
/home/sandy/Camera/Birds/ is a main repository
/home/sandy/Camera/Insects might be another
/home/sandy/Camera/Birds/Costa_Rica/Savegre_Lodge/2011-02-16/raw1
might be a place to empty raw images off a card.
Inside raw1 might be the following: raw1/DSC1234.NEF …perhaps this was a toucan photo
raw1/jpegs/DSC1234_Toucan-in-avocado-tree.jpg …now the jpeg can be matched to the raw it came from, usually one directory up
raw1/jpegs/web/up-DSC1234_Toucan-avocado-tree.jpg …700-pixel versions generated with ImageMagick go in the web directory nested inside jpegs.
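That naming convention can be scripted; here's a minimal sketch (using the hypothetical file names above) that recovers the raw a jpeg came from:

```shell
# Given a jpeg in raw1/jpegs/, derive the raw file it was made from:
# the camera ID is everything before the first underscore, and the
# raw lives one directory up with a .NEF extension.
jpg="raw1/jpegs/DSC1234_Toucan-in-avocado-tree.jpg"
id=$(basename "$jpg" | cut -d_ -f1)            # -> DSC1234
raw="$(dirname "$(dirname "$jpg")")/$id.NEF"   # -> raw1/DSC1234.NEF
echo "$raw"
```

Because the metadata lives in the file name itself, the same one-liner works no matter which program produced or consumes the files.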
Perhaps I want to add some search keywords to the date directory below Savegre_Lodge:
cd /home/sandy/Camera/Birds/Costa_Rica/Savegre_Lodge/2011-02-16
touch overcast
touch dawn
===========
sudo updatedb
Let's say I know I have some toucan photos from Savegre_Lodge but I can't remember the exact date; I do remember it was overcast at dawn (touch keywords I frequently use are about time of day and weather conditions).
locate -i Costa | grep overcast | grep -i savegre | grep -i dawn | grep -i toucan
Boom. I've got it. Most important, the relevant files are on the file system and so is the metadata, so I can now send the right files to GIMP or to RawTherapee or to darktable. The search system isn't useful ONLY to darktable.
And it doesn't take three days to import 4 terabytes of raw files, TIFs, and JPEGs into darktable's database.
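For what it's worth, the same touch-file search works without waiting for updatedb by running find directly; a sketch against a throwaway directory (the paths are made up to mirror my layout):

```shell
# Keyword search without updatedb: find the keyword files directly,
# then keep only directories that contain ALL the wanted keywords.
base=$(mktemp -d)   # throwaway stand-in for /home/sandy/Camera
mkdir -p "$base/Costa_Rica/Savegre_Lodge/2011-02-16" \
         "$base/Costa_Rica/Monteverde/2011-02-18"
touch "$base/Costa_Rica/Savegre_Lodge/2011-02-16/overcast" \
      "$base/Costa_Rica/Savegre_Lodge/2011-02-16/dawn" \
      "$base/Costa_Rica/Monteverde/2011-02-18/overcast"
# AND search: a directory qualifies only if both keyword files exist
hits=$(find "$base" -name overcast | while read -r f; do
    d=$(dirname "$f")
    if [ -e "$d/dawn" ]; then printf '%s\n' "$d"; fi
done)
echo "$hits"        # only the Savegre_Lodge date directory matches
rm -rf "$base"
```

The trade-off is that find walks the tree every time, whereas locate answers instantly from its (stale-until-updated) database.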
I'm not saying my way is better than anything. It's what works best for me. I am an outlier… I admit.
Regarding your comment here, can you share thoughts on how to structure such a sophisticated file system using just a file explorer or similar, and not go through the process of an app like Digikam or SQL database structures? I am also retired, and over the years I have accumulated a collection of about half a million images and videos. I just want to be able to find something simply and quickly without having to go through databases like SQL or whatever. Also, I've never been a programmer, so I don't know how to get involved with coding-type details, nor do I want to at this stage in life.
I notice that Microsoft Windows 11 File Explorer accommodates some degree of metadata, like keywords, comments, etc. (Alt+Enter on a media item or group of items), but that of course has limitations, like if I want to change the creation date for a film-era scanned image from the 1970s, or if I want to apply keywords or a short description to video clips. Thanks.
I might add, I've been organizing data thus far in a chronological hierarchy: year folders, 12 month subfolders, up to 31 day subfolders, with file names that follow a naming convention of YYYY-MMDD @ HHMMSS-###. But when it comes to subsequently finding something, that does not work. Usually when searching for media, in rough order of relative importance, the 5 "W" principle applies:
Who What Where Why When.
I've been making "W" #5 the priority for organization when that is the least important, or at least less important…
Leaving aside that that's a post from over 2 years ago, @pittendrigh's argument that he's not using an application-specific database is a red herring. updatedb does exactly what it says: it UPDATEs a DataBase; one that's specific to the locate command. Also, his solution is for Linux and other unix-like OSes only (as far as I know), command-line based, and rather manual:
- the user has to update the database explicitly when the data changes
- the command to interrogate the database is rather sophisticated, and he only gives an example of an AND search (OR and NOT are not all that difficult, e.g. grep -iE "savegre|dawn", but they add extra complexity to the command set to master).
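To illustrate what I mean about the command set (the sample paths are invented; a fixed list stands in for locate's output):

```shell
# Stand-in for the output of `locate -i Costa`: three invented paths.
paths='/Camera/Birds/Costa_Rica/Savegre_Lodge/2011-02-16/dawn
/Camera/Birds/Costa_Rica/Savegre_Lodge/2011-02-16/overcast
/Camera/Birds/Costa_Rica/Monteverde/2011-02-18/rain'

# AND: chain greps.  OR: alternation with -E.  NOT: invert with -v.
echo "$paths" | grep -i savegre | grep -i dawn   # AND: savegre and dawn
echo "$paths" | grep -iE 'dawn|rain'             # OR: dawn or rain
echo "$paths" | grep -iv rain                    # NOT: everything but rain
```

Each of these is learnable, but the user has to carry the boolean logic in their head on every search; a tagging UI does that part for you.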
There is nothing particularly sophisticated about his file structure, from what he posted.
Not really the kind of system I would like to work with (and I am using Linux, and I am familiar with the grep command and such), but if it works for him, fine.
Then again, if you don't want to use such a system, nor a program like Digikam or an SQL database, you're running out of options fast…
I'm using both Digikam and darktable, with keywords organised in trees, which follow more or less your 5 "W"s (one tree per "W"). Once mastered, the searching is fast and easy. And I cannot forget to update the database when adding information.
A few other points:
if you ever realise your tagging structure needs reorganising (e.g. adding a level to a tree), that's going to be fun with a system based on file names…
how do you even create multiple tag trees with a directory-based system? Hard links are an option (though afaik not possible under MS-Windows), but how do you maintain such a system of links?