What's the best type of database nowadays?

Hey Guys,

Things are very slow for me on the database front. Latest digikam update from Fedora (6.1.0) updated the database schema and has been going for hours just to get to 86% for finding new items. (Earlier today it would keep getting stuck at 83%). I’m using internal mysql and right now I have 2 mysqld processes currently running (any way to figure out which is related to digikam?) taking 8-ish% CPU. Earlier today I ran digikam and it kept getting stuck at 83%, it was complaining about ancient files (like at least a decade old) that it should have scanned in the past. I added about 214 Canon RAW files, but that’s never made things this slow in the past.

For a while now the program has been unresponsive whenever writing metadata to files - like for minutes - I usually have a book open while I’m adding metadata.

I switched to mysql because I have hundreds of thousands of photos and videos and I thought that was supposed to be the faster format. Currently database is on spinning HD, not SSD, but I don’t want to try to move it at the very least until after it’s done finding new items. Should i still be using MySQL or should it be SQlite?

If it’s not a client-server architecture, SQLite is generally the best.

But anything that writes to disk frequently will have a very low operations-per-second limit on a hard drive.

My reaction is an outlier. But I rely on the file system for storing finding and labeling image groupings. I’m a retired programmer who specialized in database stuff: Oracle, Mysql, DB2 and Postgres and even XML databases like SleepyCat and the open source Exist. Actually I did a lot with XML databases. So my database reluctance has noting to do with skills.

I don’t want my image processing life cycle to involve a database any more than I want to drive an 18 wheeler to the corner store for milk and beer.

I use the darkroom half of Darktable religiously. With the lighttable half of Darktable I only use the export button. There are sophisticated file system directory structure ways of storing images, that are a lot more accessible and easier to navigate than embedded database software.

4 Likes

I don’t use digikam, but I’m not certain that’s a database issue (even millions of rows is nothing for a database). If it’s scanning for images (as in a comparison) it makes me think something is slow at the file system side.

Wanted to add that it’s now at 86% 3 hours later. I went into the mysql directory and mysql.err had lots of entries like:

2019-06-24 18:57:29 0 [Warning] InnoDB: Table mysql/innodb_index_stats has length mismatch in the column name table_name. Please run mysql_upgrade

as well as:
2019-06-24 20:01:12 10 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery.

Although that last entry is 3 hours ago.

The only two files in .mysql.digikam/db_data with a current timestamp are: ibdata1 and ib_logfile0. Everything else has a date from today, but an older time.

In
.mysql.digikam/db_data/digikam - the latest entries:

-rw-rw----. 1 ermesa 37748736 Jun 24 23:43 ImageMetadata.ibd
-rw-rw----. 1 ermesa 17825792 Jun 24 23:43 ImageComments.ibd
-rw-rw----. 1 ermesa 573440 Jun 24 23:43 Tags.ibd
-rw-rw----. 1 ermesa 75497472 Jun 24 23:43 ImageTags.ibd
-rw-rw----. 1 ermesa 104857600 Jun 24 23:43 Images.ibd
-rw-rw----. 1 ermesa 31457280 Jun 24 23:43 ImageInformation.ibd
-rw-rw----. 1 ermesa 629145600 Jun 24 23:43 ImageCopyright.ibd

Not sure if that helps with figuring out what went wrong, but…

Meaning the file system the database resides on?

Well, either I suppose. But in light of your message about possible corruption that seems far less likely now! Missing log file is basically missing data…

Edit: but that doesn’t rule out file system corruption either

Looks like the first warning could just be about version problems (perhaps it happened before the upgrade was complete). The second error is a bit more worrying though; searches seem to suggest perhaps files were moved while it was running, or corruption. I hope you have backups!

Have you tried the AppImage?

…elaborating a bit, about darktable having its own database. I have terabytes of raw images. I do a lot of focus stacking so i have an order of magnitude more than most. One issue is I don’t want the slices in any database, only stacked images, prior to processing (stack RAWs into a tif, then edit that, then, finally, last step, export a jpeg). Having the tifs and jpegs in a database makes sense but not the raw slices.

Using a database for attaching keywords and rating values on a 0 - 10 scale is useful to Darktable but it’s useless and not available to other softwares.

If you use a well-thought-out file system way of storing and labeling images and/or image groups, then your efforts can be used by any and all software. Not just darktable.

Why not start a post/guide about it? Seems like you had some interest already (and I suppose others use filesystem database) :slight_smile:

More likely files being moved than corruption. It’s not perfect, but I’ve got btrfs RAID-1 with a scrub going to protect against corruption/bit rot.

Don’t know if Giles still hangs around here, but I’m still curious if I should go back to sqlite. At at any rate, in the long-term I should just move the database off of btrfs which is horrible for databases. It’s just that at this moment that would require throwing another drive in there.

2 Likes

I posted this once before.

/home/sandy/Camera/Birds/ is a main repository
/home/sandy/Camera/Insects might be another

/home/sandy/Camera/Birds/Costa_Rica/Savegre_Lodge/2011-02-16/raw1
might be a place to empty raw images off a card.

Inside raw1 might be the following:
raw1/DSC1234.NEF …perhaps this was a toucan photo

raw1/jpegs/DSC1234_Toucan-in-avocado-tree.jpg …now the jpeg can be matched to the raw it came from, usually one directory up

raw1/jpegs/web/up-DSC1234_Toucan-avocado-tree.jpg …700 pixel versions generated with imagemagick go in the web directory nested inside jpegs.

Perhaps I want to add some search keywords to the date directory below Savegre-Lodge:

cd /home/sandy/Camera/Birds/Costa_Rica/Savegre_Lodge/2011-02-16

touch overcast

touch dawn

===========
sudo updatedb

Lets say I know I have some toucan photos from Savegre-Lodge but I can’t remember the exact date, but I do remember is was overcast at dawn (touch keywords I frequently use are about time of day and weather conditions)

locate -i Costa | grep overcast | grep -i savegre | grep -i dawn | grep -i toucan

Boom. I’ve got it. Most important the relevant files are on the file system and so is the meta data, so I can now send the right files to gimp or to RawTherapee or to Darktable. The search process system isn’t useful ONLY to darktable.

And it doesn’t take three days to import 4 terabytes of raw files tifs and jpegs into darktable’s database.

I’m not saying my way is better than anything. It’s what works best for me.
I am an outlier.…I admit

3 Likes