DAM strategy, xmp embedded vs. sidecar, incremental backups

Hello everyone,

I am struggling to formulate a specific question, so I thought I would just share my thoughts.
I would very much appreciate feedback if there is some aspect that I am not considering properly.

I am setting up a DAM workflow and am trying to decide how to handle xmp data for jpgs.
I don’t want to alter sooc (straight-out-of-camera) jpgs, which include jpgs I was sent by friends/family, so I see two options:

A) sooc jpg + xmp sidecar, raw + xmp sidecar, derived jpg with embedded xmp
B) sooc jpg + xmp sidecar, raw + xmp sidecar, derived jpg + xmp sidecar

I expect image tags to keep changing over the years e.g. as I refine my keyword hierarchy.

What bugs me about the more commonly used approach A is that a derived jpg gets rewritten whenever its embedded xmp metadata is altered, meaning the whole file has to be backed up again.

I make incremental backups with Duplicati, so those modifications would not be such a big deal if the metadata were located at the end of the otherwise unaltered jpg file. If I understand the specifications correctly, though, metadata are located at the beginning of the jpg file.
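To check where the xmp packet actually sits in a given file, one can walk the JPEG segment markers. The following is only a minimal sketch: the function name find_xmp_offset is made up for illustration, and the parser is simplified (it assumes every segment before the image data carries a length field and ignores padding bytes and extended XMP).

```python
import struct

# Identifier that introduces an XMP packet inside a JPEG APP1 segment.
XMP_HEADER = b"http://ns.adobe.com/xap/1.0/\x00"


def find_xmp_offset(data: bytes):
    """Return the byte offset of the APP1 segment holding XMP, or None.

    Simplified sketch: scans the header segments only and stops at the
    start-of-scan marker, where the compressed image data begins.
    """
    if data[:2] != b"\xff\xd8":  # SOI marker
        raise ValueError("not a JPEG stream")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xDA:  # SOS: image data follows, no XMP found before it
            return None
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(XMP_HEADER):
            return pos
        pos += 2 + length
    return None
```

For a real file, something like find_xmp_offset(open("photo.jpg", "rb").read()) should give a small offset, well before the image data. If the packet changes length on rewrite, every byte after it shifts, so a backup tool that compares fixed-offset blocks would see the rest of the file as changed.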

So I think I will go for B.

In my view the best and most correct way is to have one xmp file for all formats and write to that. This means maintaining the same basename for all three files. That gets a bit awkward for the sooc vs. derived jpegs, but my camera saves the extension in capitals and I export with lower case, so the two remain distinguishable:

Image.JPG
Image.RAW
Image.jpg
Image.xmp

You need to look out though because the support for the xmp standard is pretty poor in free software.

Which software do you use?

How do you view your files? If you have JPEG XL (jxl) support, maybe you can export to that. I wonder whether, being newer, it handles the metadata issue in a better way. You can always generate a jpg from it if you need to, and some browsers and many apps can now view these. I am still trying to wrap my head around all the capabilities. Others might have far more knowledge and experience.

Mh, interesting idea to use different extensions for sooc and derived jpgs. I wouldn’t want to rely on case sensitivity though, because not every OS or file system is actually case sensitive. This could lead to trouble down the road. I could instead use .jpeg for the sooc jpg files, but I don’t like that either.

I have also read parts of the debate over filename.ext.xmp vs. filename.xmp somewhere else here on pixls, and I must confess I do actually prefer the double-extension convention.
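The practical difference between the two conventions shows up as soon as a jpg and a raw file share a basename. A small sketch of both mappings (the function names sidecar_single and sidecar_double are made up for illustration):

```python
from pathlib import Path


def sidecar_single(image: Path) -> Path:
    """filename.xmp: the image extension is replaced."""
    return image.with_suffix(".xmp")


def sidecar_double(image: Path) -> Path:
    """filename.ext.xmp: the sidecar extension is appended."""
    return image.with_name(image.name + ".xmp")


if __name__ == "__main__":
    for name in ("Image.JPG", "Image.RAW", "Image.jpg"):
        image = Path(name)
        print(name, "->", sidecar_single(image), "or", sidecar_double(image))
```

With the single-extension form all three files map to the same Image.xmp, which is what the "one xmp file for all formats" idea relies on. With the double-extension form each file gets its own sidecar, although Image.JPG.xmp and Image.jpg.xmp would still clash on a case-insensitive file system.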

If you use exiftool via CLI, you can apply metadata to the jpg while also keeping the original: by default exiftool leaves a backup copy with _original appended to the filename. If you add -overwrite_original then it will not keep the original.
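As a rough illustration of the two modes, here is a sketch that only builds the exiftool argument list and leaves running it to the caller. The helper name exiftool_args and the choice of tags (XMP-dc:Subject for keywords, XMP:Rating for the rating) are assumptions made for the example, not something prescribed above.

```python
def exiftool_args(image, keywords, rating=None, keep_backup=True):
    """Build an exiftool command line as a list of arguments.

    Hypothetical helper: writes the given keywords (replacing any existing
    XMP-dc:Subject list) and optionally a rating.
    """
    args = ["exiftool"]
    # Repeating the assignment once per keyword writes them as a list.
    args += [f"-XMP-dc:Subject={keyword}" for keyword in keywords]
    if rating is not None:
        args.append(f"-XMP:Rating={rating}")
    if not keep_backup:
        # Without this flag exiftool keeps a copy named <file>_original.
        args.append("-overwrite_original")
    args.append(str(image))
    return args


if __name__ == "__main__":
    print(exiftool_args("photo.jpg", ["family", "2023"], rating=4))
```

The resulting list could be passed to subprocess.run(..., check=True) on a machine where exiftool is installed. Pointing the same command at a sidecar path instead of the jpg keeps the image file itself untouched.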

I have started to use digikam for browsing the photos. I am testing its capabilities on a subset of recent photos but I plan to import my entire collection once my workflow is set.

I use darktable for raw conversion and I have been using it to tag photos, too.
I used to use Capture One, so there are plenty of raw files processed with that software, too.

I currently use a folder hierarchy like the following and am testing workflow B with sidecar files for jpgs.
After processing the raw file with darktable (with metadata export disabled) I may get:

./2023/jpg_sooc/basename_sooc.jpg ... without ratings and custom xmp metadata
./2023/raw/basename.RAF
./2023/raw/basename.RAF.xmp ......... has ratings and custom xmp metadata
./2023/raw/basename_01.RAF.xmp ...... has ratings and custom xmp metadata
./2023/jpg/basename.jpg ............. without ratings and custom xmp metadata
./2023/jpg/basename_01.jpg .......... without ratings and custom xmp metadata
./2023/jpg/basename_02.jpg .......... without ratings and custom xmp metadata

(I have written a small python-based program to import photos from my camera’s SD cards. It copies the files, compares checksums to make sure everything is copied properly, and does some renaming, to make sure that image filenames don’t repeat after 9999 photos taken. That’s also how the sooc-jpg files get their filename suffix.)
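The copy-and-verify part of such an import step could look roughly like this. It is a sketch under assumptions, not the actual import program: the function names are invented, SHA-256 is just one possible checksum, and the renaming logic is left out.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large raw files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_verified(source: Path, target: Path) -> Path:
    """Copy source to target and fail loudly if the checksums differ."""
    if target.exists():
        # Never overwrite silently; a name clash needs a decision.
        raise FileExistsError(target)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)  # copy2 also preserves timestamps
    if sha256_of(source) != sha256_of(target):
        target.unlink()
        raise OSError(f"checksum mismatch after copying {source}")
    return target
```

Refusing to overwrite an existing target is a deliberate choice here: with camera counters that wrap around after 9999 photos, a clash usually means two different images rather than a duplicate.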

Now, when importing all the images into digikam, I am faced with the issue that the jpg files lack tags.
So I have written another small python-based program which scans folders for images and sidecar files that share a basename (like all the files in my example). This program then synchronizes tags based on the information that was written most recently. I am instructing that program to only ever write xmp files, also for jpgs, using exiftool under the hood.
So this tool will then add more sidecar files to the tree:

./2023/jpg_sooc/basename_sooc.jpg ... without ratings and custom xmp metadata
./2023/jpg_sooc/basename_sooc.jpg.xmp
./2023/raw/basename.RAF
./2023/raw/basename.RAF.xmp
./2023/raw/basename_01.RAF.xmp
./2023/jpg/basename.jpg ............. without ratings and custom xmp metadata
./2023/jpg/basename.jpg.xmp
./2023/jpg/basename_01.jpg .......... without ratings and custom xmp metadata
./2023/jpg/basename_01.jpg.xmp
./2023/jpg/basename_02.jpg .......... without ratings and custom xmp metadata
./2023/jpg/basename_02.jpg.xmp

(Well, I will mostly have only a single xmp, and a single derived jpg per raw file.)
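The grouping step of such a synchronization tool could be sketched as follows. This is not the actual program: the names are invented, the set of image extensions is an assumption, variants are assumed to be marked only by a trailing _sooc or a two-digit _NN suffix as in the listing above, and file modification time stands in for "written most recently".

```python
import re
from collections import defaultdict
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".raf"}  # assumed set, extend as needed
VARIANT = re.compile(r"_(sooc|\d{2})$")     # assumed variant markers


def group_key(path: Path) -> str:
    """Reduce an image or sidecar filename to its shared basename."""
    name = path.name
    if name.lower().endswith(".xmp"):
        name = name[:-4]                    # basename.RAF.xmp -> basename.RAF
    return VARIANT.sub("", Path(name).stem)


def group_by_basename(root: Path) -> dict:
    """Map each basename to all images and sidecars below root."""
    groups = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES | {".xmp"}:
            groups[group_key(path)].append(path)
    return dict(groups)


def newest_sidecar(paths):
    """Return the most recently modified xmp sidecar, or None."""
    sidecars = [p for p in paths if p.suffix.lower() == ".xmp"]
    return max(sidecars, key=lambda p: p.stat().st_mtime, default=None)
```

For each group, the tags from newest_sidecar(...) would then be copied to the other sidecars, creating basename.jpg.xmp files where none exist yet. Modification time is a blunt proxy, since any tool touching a sidecar bumps it, so comparing a metadata date inside the xmp may be more robust.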

From here I am going to do all remaining tagging in digikam, and I will bulk-tag all images that belong together, so these changes will be made to all the xmp sidecar files simultaneously without the need for additional synchronization.

Does the software you use allow an export as jpeg rather than jpg?