Descriptive metadata: Workflow, philosophy, technology using free software

I’m really struggling with metadata handling using free software. The time I spend on metadata is close to the time I spend processing the image, even when it’s a one-time thing. But due to the metadata handling of our main software it tends to get a lot more complicated.

I’m going to use the words “tag” and “tagging” for all the descriptive metadata and the process of managing it.

Tags are important to me because they organise my data, describe the photograph when published to the web and contain important information for other people who use my photographs. Managing this across multiple file formats, destinations and processing alternatives is hard work.

I’d be interested if developers and users could respond to my numbered list below. Any comments, thoughts or tips are appreciated!

  1. Processing data from raw software is conceptually different from tags and, if my understanding is correct, not part of the xmp standard schemas but considered custom data? Processing data is largely not portable between applications and therefore not general in the same way as xmp schema data.
  2. All files with the same parent file should contain the same tags. This is given because the content will not change with format, size or processing differences.
    a. From this it follows that it’s the raw file you want to tag, as all children will (should) inherit the tags.
    b. The process is not always linear, as you might export several files only to realise you need to add additional tags. This use case quickly becomes complicated. Treating a family as one object would be useful, as would a good workflow for keeping track of family relationships.
  3. Because the data is subject dependent, not file dependent, xmp files should be associated with the basename, not the extension. This is the standard outside foss; see this thread. The arguments against are, as far as I understand, based on treating the processing settings as primary instead of the descriptive tags of xmp files.
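To make the “family” idea in point 2 concrete, here is a rough Python sketch that groups files by directory and basename. It assumes that every child keeps the parent’s basename (so an export named `IMG_0001_web.jpg` would not be matched) and that the basename is everything before the first dot. `group_families` is just a name I made up for illustration, not something any of the tools provide.

```python
from collections import defaultdict
from pathlib import Path


def group_families(paths):
    """Group file paths into families keyed by (directory, basename).

    Assumption: the basename is the file name up to the first dot, so
    "IMG_0001.NEF", "IMG_0001.NEF.xmp" and "IMG_0001.jpg" all belong
    to the family "IMG_0001".
    """
    families = defaultdict(list)
    for path in map(Path, paths):
        base = path.name.split(".", 1)[0]
        families[(path.parent, base)].append(path)
    return dict(families)


if __name__ == "__main__":
    # List every family in the current directory with its members.
    files = [p for p in Path(".").iterdir() if p.is_file()]
    for (folder, base), members in group_families(files).items():
        print(base, "->", ", ".join(sorted(m.name for m in members)))
```

With a grouping like this, a tag added to the raw file could at least be tracked down to every sibling that needs the same update.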

Digikam is the most fully featured DAM and solves some of the issues above. It can group files by filename to simplify tagging and use the basename convention for xmp files. Rawtherapee doesn’t read xmp files at all and is thus blind to tagging, so you have to manually tag the files after export. darktable uses filename.extension.xmp, resulting in multiple xmp files per raw file if other software is used. This complicates copying tags with cli tools such as exiftool and exiv2.
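The two sidecar naming conventions mentioned here are easy to spell out in code. A minimal Python sketch, assuming the sidecar sits next to the image and uses a lowercase `.xmp` extension; the function names are my own, purely for illustration:

```python
from pathlib import Path


def sidecar_candidates(image):
    """Return the sidecar path for each naming convention.

    "basename":     IMG_0001.NEF -> IMG_0001.xmp
    "filename.ext": IMG_0001.NEF -> IMG_0001.NEF.xmp
    """
    image = Path(image)
    return {
        "basename": image.with_suffix(".xmp"),
        "filename.ext": image.with_name(image.name + ".xmp"),
    }


def existing_sidecars(image):
    """Return only the candidate sidecars that actually exist on disk."""
    return [p for p in sidecar_candidates(image).values() if p.exists()]
```

If `existing_sidecars` returns two paths for one raw file, that is exactly the situation where the tags can silently drift apart between applications.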

I’d love to see discussions around metadata as involved as those about colour management (ok, perhaps a bit less involved). Come on, opinionated tagging aficionados, tell me where I’m wrong and how you see descriptive metadata practically and conceptually.

note: targeted use of cli tools can handle quite a few of the issues but is always quite the break from the workflow. Speaking as someone who doesn’t use a desktop environment and does most work from a terminal, it’s still a hassle to find the corresponding files and run the correct commands on them.
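As an illustration of the kind of cli glue I mean, here is a small Python sketch that only builds exiftool command lines for copying keywords from a sidecar to exported files; it does not run anything. It assumes the keywords live in `XMP:Subject` and relies on exiftool’s `-TagsFromFile` option. `copy_tags_commands` is a hypothetical helper, and the exact tag names are worth checking against the exiftool documentation for your own files.

```python
import shlex


def copy_tags_commands(sidecar, targets, tags=("XMP:Subject",)):
    """Build one exiftool argument list per target file.

    Each command copies the listed tags from the sidecar into the target.
    Nothing is executed here; the caller decides whether to run them.
    """
    commands = []
    for target in targets:
        cmd = ["exiftool", "-TagsFromFile", str(sidecar)]
        cmd += [f"-{tag}" for tag in tags]
        cmd.append(str(target))
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    # Print the commands for review instead of running them.
    for cmd in copy_tags_commands("IMG_0001.xmp", ["IMG_0001.jpg", "IMG_0001.tif"]):
        print(shlex.join(cmd))
```

Printing first and running later (for example with `subprocess.run(cmd, check=True)`) keeps a dry-run step in the loop, which seems sensible when the commands rewrite metadata across a whole family of files.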

I struggled a bit with this in the context of writing software. I finally came to these realizations, pertinent to no other situation than my own:

  1. I don’t like sidecars, so I want all the processing it took to get from raw to output contained somewhere in the output image’s metadata. One of the most useful things I implemented in my software is the ability to “open source” an output image: the source raw is opened and the processing from the output image is applied. My current workflow is very dependent on this.

  2. I keep all my raws, well, except for the truly bad ones. I decided after a bit of frustration with the EXIF/XML/IPTC/Makernote morass to just carry EXIF into output images, mainly for reference to the date/time, f-stop, shutter speed, ISO, and other tags relevant to how the image was shot, and leave the grand collection of tags to the original raw file. Oh, and my output images include the processing chain, per #1.

None of the above considers image categorization, so no help from me there. Anyway, FWIW…

I know almost nothing about image metadata. But I can tell you even “simple” relational data transfer between applications is hard, hard enough that it’s my entire job. Sometimes it simply won’t “fit”, depending on how each program holds the data. And graph based data… forget it! The best you can hope for is that applications stick stubbornly to standards (and that’s not as common as I’d like).

The tags are standardised. Admittedly there are a few standards (exif, xmp, dublin core, iptc etc.), but the handling of those is quite well understood. There are libraries and tools that manage them quite well and there are, from a user perspective, very few problems with reading and writing tags in the software that supports them. The problems are where the tags are stored and how to manage lots of files with lots of tags across multiple programs.

The processing tags, however, are lost to other software, which is fine. I’d probably prefer a separate sidecar for software settings, like the pp3 files of Rawtherapee, keeping the xmp files for cross-application metadata.

Edit:
Geeqie is actually quite good and can group files to apply tags. You can also configure it to use basename.xmp or filename.ext.xmp.

It is, however, quite difficult to handle tagging when the photographs require various mixes of tags. You can’t, as far as I know, apply only one tag to all selected images. You either have to tag one image at a time or apply the same tags to all selected images.

Scratch the above! I recompiled Geeqie and now you can “add selected keywords to selected files”! That and the new keyword autocomplete feature make Geeqie the best viewer and tagger!