Free software and photo metadata — a chance to engage with the broader photography industry

I have just had a long conversation with an IPTC member about photographic metadata and the free software community. The starting point for the conversation was what could be done to get industry standards better embedded into free photography software.

The great news: there is the possibility for our community to engage with the photographic metadata standards community so that (1) we can learn from their considerable expertise and experience and (2) they might learn how to better communicate what they do to people in our community.

My goal in all of this is very simple: to have free photography software be fully compliant with industry standards.

Whether you are a user or a developer, this is why you should care:

  1. Your images are found by someone on Google image search. If your photo contains industry standard metadata, Google displays your contact information and the photo’s copyright information, so a person who wants to include your photo on their blog or wherever knows how to contact you.
  2. You upload your images to a service like Flickr and you like it that the keywords and location info automatically appear.
  3. You decide 10 years from now you’d no longer like to use software XYZ but something else instead and you don’t want to lose any (or even all!) of the descriptive metadata you’ve entered. You want it to seamlessly transfer, without losing any data. Think location information, keywords, copyright, description, etc.
  4. Thirty years from now someone comes across your digital image somewhere and wants to know who the photographer was, where it was taken, etc. It might be a stranger online, or your own family member. They can’t read the back of the photo, right? They need access to metadata.

To do any of this you need your software to follow industry standards — standards that reflect decades of experience by industry veterans from photography, information science, news organizations, etc. For these things to work seamlessly, software on your desktop or in the cloud must follow standards. Otherwise you lose some or all of your metadata. You might not even notice you’ve lost it — badly written programs can silently remove photo metadata fields they’re unaware of when photos are modified by them (ouch!).

Sometimes it is tempting for users to not follow the standards because they don’t understand how to use them, or why. Developers may not understand why things are done in a certain way and not others, perhaps because the documentation is opaque or uses terms that once made perfect sense to their audience but no longer do (e.g. the distinction between headline and title).

We are very fortunate in the photography community that the industry standard was not first initiated by a proprietary software company, but by people who need to use the metadata to be able to work effectively with each other. The same cannot be said in the video industry, whose metadata standards are a vendor-driven mess compared to photography.

Fortunately for the entire industry, Adobe, Microsoft, Apple and other software companies big and small collaborated and hashed standards out. They made their software work with each other instead of against each other. There is consequently no problem of vendor lock-in. Thank goodness!

That doesn’t mean all software from these companies from the last 10 years or so always does the right thing. It doesn’t, especially older software. But things are better now than they’ve ever been. And regarding FOSS tools, we have splendid support from ExifTool and Exiv2.

You’ve seen the terms XMP, IPTC etc.

  1. XMP is the contemporary standard for storing metadata
  2. IPTC governs what metadata is actually stored, e.g. location, keywords, copyright info, etc.
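
To make the split between the two concrete, here is a minimal sketch of reading a few IPTC Core fields out of an XMP packet with Python's standard library. The sample packet and its values are made up for illustration, and `read_fields` is a hypothetical helper name. I'm assuming the usual mapping where keywords live in `dc:subject`, the copyright notice in `dc:rights`, and the city in `photoshop:City`. Real software should use a proper XMP library such as Exiv2 or ExifTool rather than hand-rolled parsing.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for the RDF container, Dublin Core and the Photoshop schema.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "photoshop": "http://ns.adobe.com/photoshop/1.0/",
}

# A hand-written, illustrative packet; every value here is invented.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:photoshop="http://ns.adobe.com/photoshop/1.0/"
    photoshop:City="Example City">
   <dc:subject>
    <rdf:Bag>
     <rdf:li>harbour</rdf:li>
     <rdf:li>sunset</rdf:li>
    </rdf:Bag>
   </dc:subject>
   <dc:rights>
    <rdf:Alt>
     <rdf:li xml:lang="x-default">(c) Example Photographer</rdf:li>
    </rdf:Alt>
   </dc:rights>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""


def read_fields(xmp_text):
    """Return keywords, copyright notice and city from an XMP packet."""
    root = ET.fromstring(xmp_text)
    desc = root.find(".//rdf:Description", NS)
    # Keywords are stored as an unordered bag of list items.
    keywords = [li.text for li in desc.findall("dc:subject/rdf:Bag/rdf:li", NS)]
    # The copyright notice is a language-alternative array; take the first entry.
    rights = desc.find("dc:rights/rdf:Alt/rdf:li", NS)
    # Simple values may be stored as attributes on rdf:Description.
    city = desc.get("{%s}City" % NS["photoshop"])
    return {
        "keywords": keywords,
        "copyright": rights.text if rights is not None else None,
        "city": city,
    }
```

XMP is the container (RDF/XML, namespaces, bags and alternatives), while IPTC says which properties go in it and what they mean.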

A very handy Japanese bento box diagram of how it all fits in with other things like Exif: Types of Metadata | Photometadata.org

An example of a proprietary software program that takes a nonstandard approach to photo metadata and consequently is really a bit of a self-inflicted mess: Corel Aftershot.

By contrast, an example of a proprietary software program that works really well with industry standards: Photo Mechanic. This is not by accident. Photo Mechanic developers have a history of engaging with and improving industry standards. :heart_eyes:

With respect to photography metadata, our goal should be to be more like Photo Mechanic, and absolutely not like Corel Aftershot (which I would classify these days as a failed application that no one should invest any time or money in). :face_vomiting:

Realistically speaking, when it comes to photo metadata, let’s be honest, some important programs in our community are most definitely not like Photo Mechanic, and are more like Aftershot. :face_with_raised_eyebrow:

On the other hand some FOSS programs have done great work to be compliant: Photo Metadata Software Support - IPTC

If you are interested in engaging with the photographic metadata standards community please share your interest and/or thoughts below. You don’t have to be a developer. If the developer of your favorite software program is not interested in following the industry standards, but you are, you can educate and pressure them to do the right thing.

If you have a specific technical or aesthetic problem with the XMP and IPTC standards, then share your thoughts too. That’s useful to know. Or link to them if you have already expressed your thoughts online in some manner.

Thanks for reading and apologies for the length.

Are you mixing correlation and causation?

Is the use of the file.ext.xmp nonstandard convention the reason Corel Aftershot is, as you call it, a “self-inflicted mess” that “no one should invest any time or money in”?

I’m saying that an apparently dying (if not already practically dead) piece of software is, in this instance, an example of a program that does a poor job of implementing the XMP spec. The reasons for the software’s overall failure are probably not unrelated to the reasons why its developers were unable or unwilling to implement the spec. Maybe they just were not up to it? Or maybe they just wanted to cash in after years of hard work, move on to other things in life, and let Corel ease the product into its inevitable death? I don’t have any insider info there.

From my conversation with the IPTC rep, he was not aware of any program on Windows apart from Aftershot that implements the file.ext.xmp approach. Every other program he is aware of uses the file.xmp approach as specified in the standard. On Windows and macOS that’s how it’s done in practice, and that’s how it’s meant to be done in the standard. (We didn’t talk about darktable during that part of the conversation.)

I simply don’t understand why the standard requires excluding the extension from the name of the XMP sidecar file.

If I have two files in the same folder 08154711.NEF and 08154711.jpg according to the Standard they would share the same xmp sidecar, though being different files…
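
The collision can be sketched in a few lines of Python. The function names here are hypothetical, and the two naming rules are just my reading of the conventions discussed in this thread, not code from any particular program.

```python
from pathlib import PurePath


def sidecar_replace_ext(image_name):
    """'file.xmp' convention: swap the image extension for .xmp."""
    return PurePath(image_name).with_suffix(".xmp").name


def sidecar_append_ext(image_name):
    """'file.ext.xmp' convention: keep the image extension and append .xmp."""
    return image_name + ".xmp"


def shares_sidecar(name_a, name_b, namer):
    """True if both images map to the same sidecar name under the given rule."""
    return namer(name_a) == namer(name_b)
```

With `08154711.NEF` and `08154711.jpg`, `shares_sidecar(..., sidecar_replace_ext)` is true (both map to `08154711.xmp`), while with `sidecar_append_ext` each file gets its own sidecar.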

In practice the JPEG (or TIFF) would embed the XMP file inside its header.

I discussed this concern earlier today with the IPTC rep (since it keeps coming up in our community). Hopefully he or another standards body person will come share their thoughts with our community directly, and anyone here can engage them with questions. I will suggest they point to the relevant parts of the spec that prescribe what to do in the case of 08154711.NEF and 08154711.jpg where for whatever strange reason there is a sidecar genuinely meant to be associated with both.

If you are worried about 08154711.NEF and 08154711.CR2, then that’s a sign of a fundamentally broken file naming scheme. What file would 08154711.JPG or 08154711.TIF be associated with or derived from?

In practice I’m more worried about 08154711.dng and 08154711.RAF because of X-Transformer.

Adobe would put the metadata directly in the dng and sidecar the RAF, but I refuse to do that with the dng.

As far as I can tell, only Lightroom implements this so-called standard. Literally no other raw developer uses file.xmp.

  • Darktable uses file.ext.xmp
  • Exposure used file.ext.exposurex5
  • ON1 uses file.on1
  • RawTherapee uses file.ext.pp3
  • Silkypix used file.ext.10.spd
  • Capture One, Luminar, ACDSee, Photo Ninja do not use sidecar files to save edits.

I call bullshit on this one. You can only call it a standard if usage is standardized.

An old favourite is back. I’m in favour of extension exclusion. If you have a jpg, dng, and pef next to each other, they should have the same xmp data. They are just different views of the same content. I think it’s fine to have software settings in a separate file, i.e. pp3, but I would love my tagging, camera settings, etc. in a standards-compliant xmp file.

I might be wrong here, but it seems many on pixls are mainly concerned about the edit metadata. In my view that’s at best secondary. The other metadata is forever and hard work; the edits may have to change for a new medium or context.

Free software is often good at following standards, so the sidestepping of such critical interoperability is a bit of a shame.

It’s the tagging that matters. XMP is part of the whole XML thing: content, style and metadata separated. Not being an expert, I’d say image metadata does seem to be an area actually well suited to XML.

To me it clearly breaks the whole concept to include the extension, but programmers don’t see it. I find it strange, but I think it’s due to looking at it very differently: focussing on edits rather than other metadata.

continued:

XMP is part of the massive metadata effort spanning decades: Dublin Core, XHTML, RSS, etc. Image metadata just has to fit this massive puzzle. XML was overdone in the 2000s, with people using it for simple config files etc., but for actual metadata it seems quite good to me, though I know very little.

I guess the file naming thing comes from the URI idea, where for example a content-negotiating web server might serve up different content at the same URI based on the needs of the viewer: JPEG, PNG, PDF, text, etc. In the same way, /home/user/img/2020/0203-example/IMGP2020 is usefully viewed as a URI, with IMGP2020.jpg, IMGP2020.dng and IMGP2020.tiff being the same data with different “styles” applied. It will have the same f-stop, the same people in it, the same content and the same location. If your file management breaks this idea it’s pretty broken, right? Even “normal” people wouldn’t do that other than by mistake.

Looking at LR, for instance, is missing the point. It’s better to consider Magnum, Reuters or something. I don’t know if they use metadata properly, but they probably do. Having your small-time local metadata fit into this system is useful. Why do it another way when the benefits seem tiny at best?

Geeqie has the right idea! They support both filename.ext.xmp and filename.xmp but they clearly understand that the dng, xmp and jpeg should be considered one because Geeqie allows you to fold all those files into one thumbnail to be treated and tagged as one.

Photo Mechanic is a proper DAM appreciated by photographers because it does things right and follows standards. Non-DAM image software “just” has to make sure it stores data in such a way that it can be ingested by DAMs without issue and according to the rules of the huge metadata universe.

I am a bit dubious about the value proposition of XMP for edits. Unless all highlight sliders of all raw developers behave exactly the same way, there really is no point in saving the highlight slider position in a cross-application compatible format. And except for very few exceptions (exposure and perhaps white balance), behaviors differ greatly between raw developers. Thus, standardized XMP can not be used to transfer edits from one program to another.

As for file.ext.xmp vs file.xmp, I sometimes like to experiment with my camera’s rendition of a picture, and will edit both the OOC JPEG, and the RAW. In this case, file.ext.xmp is a necessity, as it would otherwise be impossible to separate these two edits.

The only thing where I see standardized XMP having some value is in encoding Metadata, such as tags, colors, and ratings.

Haha. I prefer extensions AND prefixes.

The thing is, this implies that the right thing to do if you have sidecars is to have two sidecars, one for editing parameters and an xmp for metadata and that’s just ridiculous.

But it seems that that might be what I do anyway for Filmulator; it’s simpler and more reliable for me from a database maintenance perspective to have the exact same schema for the main database and sidecar databases.

I don’t think anyone expects edits to transfer to other software. Or actually some do but that’s optimistic to say the least.

This is where perhaps it makes sense to write edits to xmp: not for cross-software purposes, but simply to have one place for metadata including edits. I’m pretty sure there’s a software-specific domain in the xmp so that you can have edits from 10 programs in one file, but each program will only tamper with its own namespace (I may be using the wrong words here…)

Surely a subfolder or a renamed copy might do the job? Regardless, the metadata should be the same, no? Forget edits; the actual meaningful metadata about the file won’t change. IPTC, as mentioned by damonlynch, is

The IPTC is the global standards body of the news media. We provide the technical foundation for the news ecosystem.

The edits are a side issue but may well be best solved in xmp for tidiness reasons. Personally I don’t mind separate pp3 and xmp files. Different software will produce xmp’s anyway. So a clean separation would be a software-specific extension for edits and read/write semantic image metadata to xmp. The .pp3 or equivalent could well be file specific, but the metadata will apply to all children of the master raw file. Software like RT might never write to xml if tagging is removed, but reading is important to propagate the metadata to the output file.

Coming back to file.ext.xmp vs file.xmp controversy

I was surprised that the XMP specification could prescribe such a poorly thought-out requirement as naming an XMP sidecar file.xmp.

So I did some reading of the XMP specification Part 3, and in the introduction the standard recognizes the need to write the XMP data in a sidecar file.

It says :slight_smile:

  • Use the file extension .xmp. For Mac OS, optionally set the file’s type to ‘TEXT’.
  • For applications that need to find external XMP files, look in the same directory for a file with the same name as the main document but with an .xmp extension. (This is called a sidecar XMP file.)
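
Since "the same name" can be read either way, a tolerant reader could simply look for both spellings. The following is only a sketch under that assumption; `candidate_sidecars` and `find_sidecar` are hypothetical names, and the search order (name.xmp first) is an arbitrary choice of mine, not something the specification dictates.

```python
from pathlib import Path


def candidate_sidecars(image_path):
    """List the sidecar paths to try for an image, in the same directory."""
    image_path = Path(image_path)
    return [
        image_path.with_suffix(".xmp"),                  # name.xmp
        image_path.with_name(image_path.name + ".xmp"),  # name.ext.xmp
    ]


def find_sidecar(image_path):
    """Return the first sidecar that exists on disk, or None if there is none."""
    for candidate in candidate_sidecars(image_path):
        if candidate.is_file():
            return candidate
    return None
```

For `name.NEF` this tries `name.xmp` and then `name.NEF.xmp`. A program that writes sidecars still has to pick one convention, so this only helps on the reading side.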

But a name, and specifically a file name, is a name used to uniquely identify a computer file stored in a file system.

So, at least, a file name must include the extension, and I conclude that the name of a sidecar XMP for a “name.ext” file must be “name.ext.xmp”.

But, but, but, on the contrary, it makes no sense to have the same content tagged with different metadata. A jpeg, a tiff and a png from the same master recording should be tagged as a group (except for tags specifying target media etc.). Again, IPTC and xmp are mainly about semantic metadata.

I’m sure there’s some fundamental difference in understanding what metadata is, how it’s architected and why it’s useful behind this schism! :slight_smile: Aren’t there any information architect types around this forum? Philosophers of xml?

Perhaps no one uses metadata? I suffer at the moment because it doesn’t work well. Half my post-processing time is spent on metadata :frowning: because it doesn’t propagate as it should. When you have 100k images that data begins to matter. (This is because I use RT; dt works well within the free software ecosystem, if I remember correctly.)

If you’d actually look at how that data is stored instead of just assuming, you would see that XMP allows private namespaces where data that is not meant to be interoperable can be stored. Lightroom uses the lr namespace and darktable uses its own namespace.
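
As a rough illustration of the private namespace idea, here is a sketch where two imaginary editors keep their own `exposure` value in the same packet without touching each other's data. The namespace URIs, the property name and the helper functions are all hypothetical; they are not the namespaces Lightroom or darktable actually use.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Hypothetical namespace URIs for two imaginary editors.
EDITOR_A = "http://example.com/ns/editor-a/1.0/"
EDITOR_B = "http://example.com/ns/editor-b/1.0/"

# Illustrative packet holding one private value per editor.
SAMPLE = f"""<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="{RDF}">
  <rdf:Description rdf:about=""
    xmlns:a="{EDITOR_A}" xmlns:b="{EDITOR_B}"
    a:exposure="0.5" b:exposure="+1.0"/>
 </rdf:RDF>
</x:xmpmeta>"""


def get_private_value(xmp_text, namespace, name):
    """Read one attribute-style property from the given namespace."""
    root = ET.fromstring(xmp_text)
    desc = root.find(f".//{{{RDF}}}Description")
    return desc.get(f"{{{namespace}}}{name}")


def set_private_value(xmp_text, namespace, name, value):
    """Update one property in the given namespace and leave all others alone."""
    root = ET.fromstring(xmp_text)
    desc = root.find(f".//{{{RDF}}}Description")
    desc.set(f"{{{namespace}}}{name}", value)
    return ET.tostring(root, encoding="unicode")
```

Setting `exposure` in `EDITOR_A` leaves the `EDITOR_B` value as it was. Note that `ElementTree` may rename the prefixes when serializing, which is one more reason real tools go through a dedicated XMP library.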

But if you have name.xmp and name.DNG.pp3 and name.RAF.pp3 then you get to keep your edits for different raw variants differently.

Metadata like tags that get shared across all same-named input files should go in the name.xmp. But edit sidecars need to be associated with individual input files.

It’s not about stepping on other editors’ toes, it’s about stepping on your own toes.
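
A small sketch of that layout, assuming shared metadata goes in `name.xmp` and each input file gets its own edit sidecar. `plan_sidecars` is a hypothetical helper, and `.pp3` is used only because it is the example extension from this thread.

```python
from collections import defaultdict
from pathlib import PurePath


def plan_sidecars(image_names, edit_ext=".pp3"):
    """Map each base name to one shared metadata sidecar and per-file edit sidecars."""
    plan = defaultdict(lambda: {"metadata": None, "edits": {}})
    for name in image_names:
        stem = PurePath(name).stem
        # One metadata sidecar per base name, shared by every variant.
        plan[stem]["metadata"] = stem + ".xmp"
        # One edit sidecar per concrete input file.
        plan[stem]["edits"][name] = name + edit_ext
    return dict(plan)
```

For `name.DNG` and `name.RAF` this yields a single `name.xmp` plus `name.DNG.pp3` and `name.RAF.pp3`, so tags are shared while the edits stay separate.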

but i agree with bastibe on this one. even putting it into a shared file raises the expectation that other software can read it/interpret it. essentially i think we never gained anything from putting dt edits into xmp (maybe except embedding edit history in jpg). to the contrary, it has a few idiosyncratic ideas about minimum and maximum file size… and it did happen in the past that these files got overwritten by others and the dt edits were lost.

and also if one was to follow the jpg + raw have the same xmp argument… it only makes sense if there is no edit history in the xmp. i mean maybe you want to re-open the jpg and apply a watermark or post-sharpen even if it’s in fact the same image. i guess i’m saying i’m all for taking edit history out of xmp/ignoring xmp completely if you’re not interested in metadata.

Well it is just bad form on the part of those other applications to just blast over a file that already has content.

I don’t believe anything says that you have to apply the whole xmp file to every file, so if you wanted to support this use case, the app could silently ignore edit history for non-raw files or prompt the user for what action to take.

hm. depending on other software to do the right thing is asking for trouble. no need to open that door imo.

not applying the xmp data selectively does not sound very well thought out to me. certainly not to the level of a standard. and if you ask me… also in my contrived example you need two xmp for two different files, because there would be different things inside. you can’t go ignore parts. if at all you’d need to store a block in xmp: do this if the extension was “jpg” and do that if the extension was “cr2” … which sounds even more backwards.

basically: what CarVac said above: