Hierarchical tag chaos

Thank you, great people, for all the work done on dt3!
I’m especially happy with the improvements to the metadata/tagging interface. I now feel a strong urge to drop rating and tagging in digikam and do everything in dt.

HOWEVER

After importing a stash of photos into dt as a test, I found that I suddenly have a lot of duplicate tags on my photos.

For instance:
Tags in digikam on one image

  • GEOGRAPHIC LOCATION|Europe|Central Europe|Belgium

Tags in darktable on same image as in digikam

  • GEOGRAPHIC LOCATION
  • GEOGRAPHIC LOCATION|Europe
  • GEOGRAPHIC LOCATION|Europe|Central Europe
  • GEOGRAPHIC LOCATION|Europe|Central Europe|Belgium
  • Europe
  • Belgium
  • Central Europe

It appears that darktable took apart the hierarchical tags that were applied in digikam (and previously lightroom) and now displays them as separate tags. The tags are shown in a similar way in the tag dictionary.
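
A minimal sketch of how the seven tags listed above could arise from the single digikam tag, assuming (this is not confirmed against the darktable source) that every ancestor path and every individual component ends up as its own tag. The function name `expand_tag` is made up for illustration.

```python
def expand_tag(path, sep="|"):
    """Return every ancestor path plus every single component of a hierarchical tag."""
    parts = path.split(sep)
    # Ancestor paths: "A", "A|B", "A|B|C", ...
    prefixes = [sep.join(parts[:i]) for i in range(1, len(parts) + 1)]
    # dict.fromkeys drops duplicates while keeping first-seen order
    return list(dict.fromkeys(prefixes + parts))


tags = expand_tag("GEOGRAPHIC LOCATION|Europe|Central Europe|Belgium")
for tag in tags:
    print(tag)
```

The top-level component and the shortest ancestor path are the same string, which is why one four-level tag gives seven entries rather than eight.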

I tried to figure out whether the EXIF/IPTC fields are populated with single tags as well as hierarchical ones. Unfortunately my knowledge is too limited to say for sure, but it is possible that both are present.

Is there a way to tell darktable to only use/display the hierarchical tags?
I would appreciate not having to re-organise my tags and re-tag 50k+ images…

Any hint is appreciated.

As far as I remember:
If xmp.lr.hierarchicalSubject is present in the xmp file, dt imports those tags and ignores xmp.dc.subject (single tags).
If xmp.lr.hierarchicalSubject is not present, it imports the single tags.
But it also imports iptc.keywords.
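
The rule as recalled above can be written down as a small model. This is only a sketch of the behaviour described in this post, not darktable's actual code; the function name `tags_to_import` and the choice to skip keywords that are already present are assumptions.

```python
def tags_to_import(hierarchical, dc_subject, iptc_keywords):
    """Model of the import rule as recalled in this thread (not verified against darktable)."""
    # Hierarchical tags take precedence over dc:subject when any are present
    chosen = list(hierarchical) if hierarchical else list(dc_subject)
    # IPTC keywords are imported in either case (duplicates skipped: an assumption)
    for keyword in iptc_keywords:
        if keyword not in chosen:
            chosen.append(keyword)
    return chosen


with_hierarchy = tags_to_import(["Europe|Belgium"], ["Belgium"], ["Belgium"])
without_hierarchy = tags_to_import([], ["Belgium"], ["Europe"])
print(with_hierarchy)
print(without_hierarchy)
```

Under this model a flat "Belgium" still shows up next to "Europe|Belgium" whenever the IPTC keywords carry it, even though dc:subject is ignored.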

There is no way to control dt’s behavior here.
To make it work, I would suggest cleaning any xmp.dc.subject and iptc.keywords out of your xmp files, using exiv2 or exiftool, and making sure xmp.lr.hierarchicalSubject is present.
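
For illustration only, here is a rough Python sketch of what removing dc:subject from a sidecar amounts to, using the standard library instead of exiv2 or exiftool. It works on the XML text, does not preserve xpacket wrappers or the original formatting, and does not touch IPTC keywords, so for real files the dedicated tools are likely the safer choice. The function name `strip_dc_subject` and the sample sidecar are made up.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
NAMESPACES = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": DC,
    "lr": "http://ns.adobe.com/lightroom/1.0/",
}


def strip_dc_subject(xmp_text):
    """Return the XMP text with every dc:subject element removed."""
    # Register the usual prefixes so the output keeps lr:, rdf:, x: names
    for prefix, uri in NAMESPACES.items():
        ET.register_namespace(prefix, uri)
    root = ET.fromstring(xmp_text)
    # Snapshot the element list first so removal does not disturb iteration
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == f"{{{DC}}}subject":
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily shortened sidecar used only to exercise the function
sample = (
    '<x:xmpmeta xmlns:x="adobe:ns:meta/">'
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
    '<rdf:Description rdf:about=""'
    ' xmlns:dc="http://purl.org/dc/elements/1.1/"'
    ' xmlns:lr="http://ns.adobe.com/lightroom/1.0/">'
    "<dc:subject><rdf:Bag><rdf:li>Belgium</rdf:li></rdf:Bag></dc:subject>"
    "<lr:hierarchicalSubject><rdf:Bag><rdf:li>Europe|Belgium</rdf:li></rdf:Bag>"
    "</lr:hierarchicalSubject>"
    "</rdf:Description></rdf:RDF></x:xmpmeta>"
)

cleaned = strip_dc_subject(sample)
print(cleaned)
```

Try it on copies of the sidecars first, since the originals are the only record of the tags.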

Then, deleting the useless tags is fast, even if one has to do it one by one.

Removing the xmp.dc.subject (and iptc.keywords?) tags won’t help with the tags of the form <t>|<t>|...; those have to come from xmp.lr.hierarchicalSubject or another field storing a hierarchy.

But in Digikam you can select which fields to write to sidecars, in settings->metadata->advanced.
That should allow you to at least get rid of the individual tags. I don’t think you can get rid of the composite tags that are stored outside the tree that way, though.

Can you share one of these xmp files?

Thank you for your comments.
Attached are a photo and the xmp files from both digikam and darktable.


20180808_Ostsee_0235.JPG.xmp (2.2 KB)
20180808_Ostsee_0235.xmp (9.4 KB)

If I understand correctly, the jpg.xmp file is the xmp written by darktable for an imported jpg (?) image that has the digikam xmp companion file… Is that correct?

I’m a bit lost. Can you describe the workflow you follow? I want to be sure what the (clean) source is and what the unwanted effect is.

In the xmp file, the hierarchical subject already contains single tags, those you don’t want to see in darktable, right? …

   <lr:hierarchicalSubject>
    <rdf:Bag>
     <rdf:li>GEOGRAFISCHER STANDORT</rdf:li>
     <rdf:li>GEOGRAFISCHER STANDORT|Kontinent|Europa|Mitteleuropa|Deutschland|Mecklenburg-Vorpommern|18374 Zingst</rdf:li>
     <rdf:li>PRIVATE METADATEN|FERIEN|Urlaub|2018 Ostsee</rdf:li>
     <rdf:li>GEOGRAFISCHER STANDORT|Kontinent|Europa</rdf:li>
     <rdf:li>GEOGRAFISCHER STANDORT|Kontinent|Europa|Mitteleuropa|Deutschland</rdf:li>
     <rdf:li>GEOGRAFISCHER STANDORT|Kontinent|Europa|Mitteleuropa|Deutschland|Mecklenburg-Vorpommern|18374 Zingst|Strandübergang 4a</rdf:li>
     <rdf:li>PRIVATE METADATEN</rdf:li>
     <rdf:li>PRIVATE METADATEN|FERIEN|Urlaub</rdf:li>
     <rdf:li>GEOGRAFISCHER STANDORT|Kontinent</rdf:li>
     <rdf:li>GEOGRAFISCHER STANDORT|Kontinent|Europa|Mitteleuropa</rdf:li>
     <rdf:li>PRIVATE METADATEN|FERIEN</rdf:li>
     <rdf:li>GEOGRAFISCHER STANDORT|Kontinent|Europa|Mitteleuropa|Deutschland|Mecklenburg-Vorpommern</rdf:li>
    </rdf:Bag>
   </lr:hierarchicalSubject>
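
To see how much of that list is redundant, here is a small sketch that keeps only the entries that are not an ancestor of another entry. The helper name `leaf_paths` is made up, and whether the ancestor entries are actually unwanted is exactly the open question here.

```python
def leaf_paths(tags, sep="|"):
    """Keep only paths that are not an ancestor of another path in the list."""
    unique = list(dict.fromkeys(tags))  # drop exact duplicates, keep order
    return [
        tag for tag in unique
        if not any(other.startswith(tag + sep) for other in unique)
    ]


geo = "GEOGRAFISCHER STANDORT"
private = "PRIVATE METADATEN"
# The twelve entries from the hierarchicalSubject bag above, in file order
subjects = [
    geo,
    geo + "|Kontinent|Europa|Mitteleuropa|Deutschland|Mecklenburg-Vorpommern|18374 Zingst",
    private + "|FERIEN|Urlaub|2018 Ostsee",
    geo + "|Kontinent|Europa",
    geo + "|Kontinent|Europa|Mitteleuropa|Deutschland",
    geo + "|Kontinent|Europa|Mitteleuropa|Deutschland|Mecklenburg-Vorpommern|18374 Zingst|Strandübergang 4a",
    private,
    private + "|FERIEN|Urlaub",
    geo + "|Kontinent",
    geo + "|Kontinent|Europa|Mitteleuropa",
    private + "|FERIEN",
    geo + "|Kontinent|Europa|Mitteleuropa|Deutschland|Mecklenburg-Vorpommern",
]

leaves = leaf_paths(subjects)
for path in leaves:
    print(path)
```

Only two of the twelve entries are full paths; the other ten are ancestors of one of those two.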