Hierarchical tag chaos

Thank you, great people, for all the work done on dt3!
I’m especially happy with the improvements to the metadata/tagging interface. I now feel a strong urge to drop rating and tagging in digikam and do everything in dt.

HOWEVER

after importing a stash of photos to dt to test, I found that I suddenly have a lot of duplicate tags on my photos.

for instance
Tags in digikam on one image

  • GEOGRAPHIC LOCATION|Europe|Central Europe|Belgium

Tags in darktable on same image as in digikam

  • GEOGRAPHIC LOCATION
  • GEOGRAPHIC LOCATION|Europe
  • GEOGRAPHIC LOCATION|Europe|Central Europe
  • GEOGRAPHIC LOCATION|Europe|Central Europe|Belgium
  • Europe
  • Belgium
  • Central Europe

It appears that darktable took apart the hierarchical tags that were applied in digikam (and previously Lightroom) and now displays the pieces as separate tags. The tags are shown in a similar way in the tag dictionary.
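To picture where seven entries can come from when there was only one tag, here is a minimal Python sketch. It is not darktable's import code; `expand_hierarchy` is a made-up helper that just lists every ancestor path of a hierarchical tag plus its individual components, assuming `|` as the separator.

```python
def expand_hierarchy(path, sep="|"):
    """Return (ancestor paths incl. the full path, single components)."""
    parts = path.split(sep)
    # "A|B|C" -> ["A", "A|B", "A|B|C"]
    prefixes = [sep.join(parts[:i]) for i in range(1, len(parts) + 1)]
    return prefixes, parts


prefixes, singles = expand_hierarchy(
    "GEOGRAPHIC LOCATION|Europe|Central Europe|Belgium"
)

# The four paths plus the three non-root components give the same
# seven entries as in the darktable list above (order differs).
for tag in prefixes + singles[1:]:
    print(tag)
```

If the sidecar stores both the paths and the single words, an importer that reads every field would end up with all of them.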

I tried to figure out whether the EXIF/IPTC fields are populated with single tags as well as hierarchical ones. Unfortunately my knowledge is too limited to say for sure, but it is possible that both are present.

Is there a way to tell darktable to only use/ display the hierarchical tags?
I would appreciate if I did not have to re-organise my tags and re-tag +50k images…

Any hint is appreciated.

As far as I remember,
if xmp.lr.hierarchicalSubject is present in the xmp file, dt imports it and ignores xmp.dc.subject (single tags).
If xmp.lr.hierarchicalSubject is not present, it imports the single tags.
But it also imports iptc.keywords.

There is no way to control dt's behavior here.
To make it work, I would suggest cleaning any xmp.dc.subject and iptc.keywords out of your xmp files, using exiv2 or exiftool, and making sure xmp.lr.hierarchicalSubject is present.

Then, deleting the useless tags is fast, even if you have to do it one by one.

Removing the xmp.dc.subject (and iptc.keywords?) tags won’t help with the tags of the form <t>|<t>|..., those have to come from xmp.lr.hierarchicalSubject or another field storing a hierarchy.

But in Digikam you can select which fields to write to sidecars, in settings->metadata->advanced.
That should allow you to at least get rid of the individual tags. I don’t think you can get rid of the composite tags that are stored outside the tree that way, though.

Can you share one of these xmp files ?

Thank you for your comments,
attached a photo and both the xmp files of digikam and darktable.


20180808_Ostsee_0235.JPG.xmp (2.2 KB)
20180808_Ostsee_0235.xmp (9.4 KB)

If I understand properly, the jpg.xmp file is the xmp that darktable wrote for an imported jpg image which already had the digikam xmp companion file… Is that correct?

I’m a bit lost. Can you describe the workflow you follow ? I want to be sure of what is the (clean) source and what is the unwanted effect ?

in the xmp file the hierarchical subject already contains single tags, those you don’t want to see in darktable, right ? …

   <lr:hierarchicalSubject>
    <rdf:Bag>
     <rdf:li>GEOGRAFISCHER STANDORT</rdf:li>
     <rdf:li>GEOGRAFISCHER STANDORT|Kontinent|Europa|Mitteleuropa|Deutschland|Mecklenburg-Vorpommern|18374 Zingst</rdf:li>
     <rdf:li>PRIVATE METADATEN|FERIEN|Urlaub|2018 Ostsee</rdf:li>
     <rdf:li>GEOGRAFISCHER STANDORT|Kontinent|Europa</rdf:li>
     <rdf:li>GEOGRAFISCHER STANDORT|Kontinent|Europa|Mitteleuropa|Deutschland</rdf:li>
     <rdf:li>GEOGRAFISCHER STANDORT|Kontinent|Europa|Mitteleuropa|Deutschland|Mecklenburg-Vorpommern|18374 Zingst|Strandübergang 4a</rdf:li>
     <rdf:li>PRIVATE METADATEN</rdf:li>
     <rdf:li>PRIVATE METADATEN|FERIEN|Urlaub</rdf:li>
     <rdf:li>GEOGRAFISCHER STANDORT|Kontinent</rdf:li>
     <rdf:li>GEOGRAFISCHER STANDORT|Kontinent|Europa|Mitteleuropa</rdf:li>
     <rdf:li>PRIVATE METADATEN|FERIEN</rdf:li>
     <rdf:li>GEOGRAFISCHER STANDORT|Kontinent|Europa|Mitteleuropa|Deutschland|Mecklenburg-Vorpommern</rdf:li>
    </rdf:Bag>
   </lr:hierarchicalSubject>
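To make it easier to see which of these twelve entries are full paths and which are just ancestors of another entry, here is a small Python sketch. `leaf_paths` is a hypothetical helper, not something darktable or digikam provides; it only drops entries that are a strict prefix of another entry.

```python
def leaf_paths(entries, sep="|"):
    """Keep only the entries that are not an ancestor of another entry."""
    unique = set(entries)
    return sorted(
        e for e in unique
        if not any(other.startswith(e + sep) for other in unique)
    )


BASE = "GEOGRAFISCHER STANDORT|Kontinent|Europa|Mitteleuropa"
MV = BASE + "|Deutschland|Mecklenburg-Vorpommern"

# The twelve rdf:li values from the sidecar above.
entries = [
    "GEOGRAFISCHER STANDORT",
    MV + "|18374 Zingst",
    "PRIVATE METADATEN|FERIEN|Urlaub|2018 Ostsee",
    "GEOGRAFISCHER STANDORT|Kontinent|Europa",
    BASE + "|Deutschland",
    MV + "|18374 Zingst|Strandübergang 4a",
    "PRIVATE METADATEN",
    "PRIVATE METADATEN|FERIEN|Urlaub",
    "GEOGRAFISCHER STANDORT|Kontinent",
    BASE,
    "PRIVATE METADATEN|FERIEN",
    MV,
]

leaves = leaf_paths(entries)
for path in leaves:
    print(path)
```

For this file only two entries survive: the path ending in "Strandübergang 4a" and the one ending in "2018 Ostsee". All the others are ancestors of one of those two.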

I have the same problem in dt 3.8. Is it now possible to filter the tags so that only the hierarchies are shown?

Same here. I would like to import all my old Lightroom pictures into dt. I have already exported the keyword file from Lightroom and imported it in dt: the hierarchy is preserved. However, when I import the photos (with the associated xmp), the tag database is suddenly populated with all the de-hierarchized (if that’s a word) tags.

Note that my tag tree is quite big and it is cumbersome to delete and reassign tags (by the way, I haven’t found a way to delete more than one tag at a time in the tag tree… is it possible?)

thanks for any help
giuseppe

Does the xmp file have the tags in the field lr:hierarchicalSubject, in dc:subject, or both? For the first case it should work, for the second probably not, and for the third it depends on what’s saved in each. Can you upload an example?
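If you want to check this yourself, a rough Python sketch along these lines lists what each of the two fields contains. It uses only the standard library; `keyword_fields` and the embedded `SAMPLE` sidecar are made up for illustration, and the namespace URIs are the usual Dublin Core, Lightroom and RDF ones.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "lr": "http://ns.adobe.com/lightroom/1.0/",
}


def keyword_fields(xmp_data):
    """Return the rdf:li values found under dc:subject and lr:hierarchicalSubject."""
    root = ET.fromstring(xmp_data)
    li_tag = "{%s}li" % NS["rdf"]
    found = {}
    for field in ("dc:subject", "lr:hierarchicalSubject"):
        prefix, local = field.split(":")
        tag = "{%s}%s" % (NS[prefix], local)
        found[field] = [
            li.text for el in root.iter(tag) for li in el.iter(li_tag)
        ]
    return found


# Tiny made-up sidecar, only to show the shape of the data.
SAMPLE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:lr="http://ns.adobe.com/lightroom/1.0/">
   <dc:subject><rdf:Bag><rdf:li>Belgium</rdf:li></rdf:Bag></dc:subject>
   <lr:hierarchicalSubject><rdf:Bag>
    <rdf:li>Europe|Belgium</rdf:li>
   </rdf:Bag></lr:hierarchicalSubject>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""

fields = keyword_fields(SAMPLE)
for name, values in fields.items():
    print(name, values)
```

For a real sidecar you would pass the file's bytes instead of `SAMPLE`. An empty list for a field means that field is not in the file.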

Hola,

thanks for picking this up! So, there seem to be 2 xml tags:

  1. dc:subject, which has all the components of a tag hierarchy listed as single tags

  2. lr:weightedFlatSubject, where only the last part of a tag hierarchy is listed as a single tag

I am attaching an .xml file. Please note that the original lightroom tags were:

~ATTRIBUTE | ~Number | group of people
~WHERE | North America | Mexico | Mexico City
~WHAT | ~Structures & Architecture | shop
~WHAT | ~Thoroughfare | street
~WHAT | ~Food | tortilla

Also note that some of the parent tags in the hierarchy (such as ~ATTRIBUTE, North America, ~Structures & Architecture, etc.) are set up in Lightroom not to be exported in the xml (only their child tags will be exported).

X1FA6412.xmp (7.4 KB)

cheers
giuseppe

Sorry, but I see nothing in your xml file that would allow darktable (or any other program) to reconstruct the tag tree. So all it can do is assign each tag on its own (not even uppercase initial letters help, as you have geographical names as subtags).

As an aside: if you don’t export the parent tags as part of the tags for each image, you will end up with a much larger number of tag trees. Perhaps not what you want.

If possible, re-exporting the sidecars from lightroom with hierarchical tags is probably the easiest.
If not possible, you may have to correct the tagging “by hand”. Tedious, but not impossible if you have the tag structure in place: select all images with tag “tortilla” and assign “Food|tortilla” to them, then remove the tag “tortilla”. At the end, remove tags with 0 (zero) images.
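Before doing this by hand, it may help to work out which flat tags map to exactly one place in the tree. Here is a minimal Python sketch of that bookkeeping, assuming you have the tree as a list of `|`-separated paths. `plan_retag` is a hypothetical helper; it only builds a to-do list and changes nothing in darktable.

```python
from collections import defaultdict


def plan_retag(flat_tags, tree_paths, sep="|"):
    """Map each flat tag to its full path when the match is unambiguous."""
    by_leaf = defaultdict(list)
    for path in tree_paths:
        by_leaf[path.split(sep)[-1]].append(path)

    plan, unresolved = {}, []
    for tag in flat_tags:
        candidates = by_leaf.get(tag, [])
        if len(candidates) == 1:
            plan[tag] = candidates[0]
        else:
            # Not a leaf of the tree, or a leaf in several branches:
            # this one needs a human decision.
            unresolved.append(tag)
    return plan, unresolved


# Paths taken from the Lightroom hierarchy quoted earlier in the thread.
tree = [
    "~WHAT|~Food|tortilla",
    "~WHAT|~Thoroughfare|street",
    "~WHERE|North America|Mexico|Mexico City",
]

plan, unresolved = plan_retag(["tortilla", "street", "Mexico"], tree)
print(plan)
print(unresolved)
```

Here "tortilla" and "street" resolve to a single path each, while "Mexico" ends up in `unresolved` because it is not the last level of any path in this list.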

Thanks for the suggestion! I’ll try to go the re-exporting route then, and if that doesn’t work, I guess I’ll have to bite the 1000-pound bullet sooner or later…

I apologize, I hadn’t seen an exporting option in Lightroom which says “Write keywords as Lightroom hierarchy”. This actually writes the field lr:hierarchicalSubject in the .xmp file as @guille2306 said, e.g.:

   <lr:hierarchicalSubject>
    <rdf:Bag>
     <rdf:li>Mexico|Mexico City</rdf:li>
     <rdf:li>shop</rdf:li>
     <rdf:li>street</rdf:li>
     <rdf:li>~ATTRIBUTE|~Number|group of people</rdf:li>
     <rdf:li>~Food|tortilla</rdf:li>
    </rdf:Bag>
   </lr:hierarchicalSubject>

So, I should be good to go. Thanks for your help!

giuseppe

Yes, this should work, but only if you manage to export the full hierarchy in the tags (including the not-for-export levels)

Yes, that’s what I figured. Thanks again!