Hierarchical keywords

I’ve got “detritus” from literally four DAMs in my image keywording.

Many, many moons ago, I started down this road with iMatch, which was interesting. After flailing around a bit, I exported all my keywords to XMP files and adopted iView, which later became Expression Media under Microsoft, and then Media Pro under Phase One.

And finally all this got migrated to Lightroom about 8 years ago.

I was overly ambitious with my keywording and have way too many, and honestly I stopped applying keywords about the time I adopted LR.

My copy of LR is getting a bit long in the tooth, so I’ve installed digiKam on a Linux box and it’s currently reading all the data associated with some 70k images on my NAS box.

It’s about 25% complete, and looking at the keywords in the Captions -> Tags view, this is what I’ve got:


Some of the keywords are displayed as intended and some are just strung together with periods separating the terms.

Any idea why?


There are several ways to store keywords in XMP, and they don’t all use the same separators to indicate a tag hierarchy. Perhaps have a look at a few sidecar files in a text editor? (They are plain text, so easy to read. Perhaps less easy to understand…)

And perhaps one of the programs you used initially didn’t have hierarchical keywords, and you separated them with dots to simulate a hierarchy? See e.g. the “People.Family.Verda” tags you show: that looks to me like a tag for pictures with 2 or 3 people in them: “Sherry”, “Chris” and sometimes others…
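If opening sidecars one by one in a text editor gets tedious, a small script can list every keyword-like field side by side so the different separators stand out. This is only a sketch: the sample XMP below is made up for illustration (the “|” and “/” separators are how I understand Lightroom and digiKam usually write their hierarchies), and `list_keyword_fields` is a hypothetical helper name, not part of any tool mentioned here.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Invented sample sidecar, for illustration only
SAMPLE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:lr="http://ns.adobe.com/lightroom/1.0/"
    xmlns:digiKam="http://www.digikam.org/ns/1.0/">
   <dc:subject><rdf:Bag><rdf:li>People.Family.Verda</rdf:li></rdf:Bag></dc:subject>
   <lr:hierarchicalSubject><rdf:Bag><rdf:li>People|Family|Verda</rdf:li></rdf:Bag></lr:hierarchicalSubject>
   <digiKam:TagsList><rdf:Seq><rdf:li>People/Family/Verda</rdf:li></rdf:Seq></digiKam:TagsList>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""


def list_keyword_fields(xmp_text):
    """Return {property tag: [values]} for every property that holds an rdf:Bag or rdf:Seq."""
    root = ET.fromstring(xmp_text)
    found = {}
    for elem in root.iter():
        for container in ("Bag", "Seq"):
            bag = elem.find(f"{{{RDF}}}{container}")
            if bag is not None:
                found[elem.tag] = [li.text for li in bag.findall(f"{{{RDF}}}li")]
    return found


if __name__ == "__main__":
    for tag, values in list_keyword_fields(SAMPLE).items():
        print(tag, values)
```

For a real sidecar, read the file into a string first and pass that in instead of `SAMPLE`.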


I guess I never paid much attention before. This situation existed in LR. It took some time, but I’ve got everything behaving much better now.



I’m hitting my head against a brick wall.

This has to be detritus from other tag management, but I’m losing my mind over this.

I’ve reconfigured the tags in a hierarchical tree all pretty and fine using the Tag tools in DK.

Once I write all the tags to the files, a number of tags “reappear” automagically.

8 of them to be precise. Selecting those tags and then removing them from the files works for about 30 seconds, and then a bevy more of them arrive.

Deleting the tags exhibits similar behavior.



And the answer is …


I had almost forgotten about that piece of fecal matter.

Once upon a time I made the mistake of buying a cheap ACDSee DAM product. About 2 weeks after I bought it, ACDSee deprecated it with no upgrade path.

Luckily it was within my return window, so I did exactly that.

That was several years ago.

Via DK it has become apparent that this piece of detritus inserted a plethora of junk into the xmp files, for instance:


It also stuck in a subject field:


I’m going to try and clean the crap out via exiftool and then see where this goes.

Edit: this forum software certainly doesn’t like carriage returns…

Use the preformatted text option (</>) for code and such, where line breaks and other whitespace is important.

Never had any problems with them… If you want to insert a ‘<’, precede it with a backslash (‘\’),
and of course, as @Donatzsky said, use the proper option for code.

Concerning the acdsee tags: digiKam can read and write those, and quite a few others. I have sidecars with the “acdsee:categories” (as attribute), also some “dc:”, “mediapro:” and “lr:” tags and attributes (and I know I never used any of those three programs!). Check out the XML namespaces (the lines starting with “xmlns:”).
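To see at a glance which programs have left their mark on a sidecar, the “xmlns:” declarations can be listed with a few lines of Python. A rough sketch, using an invented sample and a hypothetical helper name (`declared_namespaces`); it relies only on the standard library’s `iterparse` with the `start-ns` event.

```python
import io
import xml.etree.ElementTree as ET

# Invented sample sidecar, for illustration only
SAMPLE = b"""<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:lr="http://ns.adobe.com/lightroom/1.0/"/>
 </rdf:RDF>
</x:xmpmeta>"""


def declared_namespaces(source):
    """Map each declared prefix to its URI; source is a file path or a binary file object."""
    return {prefix: uri for _, (prefix, uri) in ET.iterparse(source, events=("start-ns",))}


if __name__ == "__main__":
    for prefix, uri in declared_namespaces(io.BytesIO(SAMPLE)).items():
        print(f"{prefix:8} {uri}")
```

Passing a sidecar’s path instead of the `BytesIO` object should work the same way.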

Well, exiftool to the rescue. Threw this command at the lot:

e:\documents\photography\software\exiftool\exiftool -XMP-photoshop:EmbeddedXMPDigest= -XMP-photoshop:LegacyIPTCDigest= -XMP-digiKam:TagsList= -XMP-digiKam:ColorLabel= -XMP-digiKam:PickLabel= -XMP-mediapro:CatalogSets= -xmp-microsoft:all= -HierarchicalSubject-=Technical -HierarchicalSubject-=Unaltered -HierarchicalSubject-=Workflow -HierarchicalSubject-=CampOuts -HierarchicalSubject-="Clay's" -HierarchicalSubject-=House -HierarchicalSubject-="National Mall" -HierarchicalSubject-="Photographic Testing" -HierarchicalSubject-="Post processed" -HierarchicalSubject-=Rapids -HierarchicalSubject-=Rated -HierarchicalSubject-=Tennis -HierarchicalSubject-="Things of Interest" -HierarchicalSubject-=Uncategorized -HierarchicalSubject-=Location -HierarchicalSubject-="2 Stars" -ext xmp -ext jpg -r y:\

Took about six hours, but everything is cleaned up.
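For anyone facing a similarly long list, the repeated `-HierarchicalSubject-=` options could be generated from a plain Python list rather than typed by hand. A sketch only: `build_exiftool_args` is a hypothetical helper, it assumes `exiftool` is on the PATH, and it covers just the keyword removals, `-ext` and `-r` from the command above (not the other tag deletions).

```python
def build_exiftool_args(keywords, root, extensions=("xmp", "jpg")):
    """Build an exiftool argument list that removes each keyword from HierarchicalSubject."""
    args = ["exiftool"]  # assumption: exiftool is on PATH
    for keyword in keywords:
        # One removal option per keyword, as in the hand-written command
        args.append(f"-HierarchicalSubject-={keyword}")
    for ext in extensions:
        args += ["-ext", ext]
    # Recurse from the given root folder
    args += ["-r", root]
    return args


if __name__ == "__main__":
    unwanted = ["Technical", "Unaltered", "National Mall"]
    print(build_exiftool_args(unwanted, "y:\\"))
```

Handing the resulting list to `subprocess.run(args, check=True)` avoids shell quoting problems for keywords with spaces, such as "National Mall".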

Doing some looking around, I found that Capture One had grabbed some of this malformed junkola and written it into its sidecar .cos files. Sigh.

Exiftool will read .cos files but not write them. Sigh, again.

So, whipping out the (t)rusty python, I put this together:

import os

# folder path (a raw string cannot end in a single backslash, so escape it instead)
root = "Y:\\"

for path, subdir, files in os.walk(root):

    for name in files:
        if name.endswith('.cos'):
            filename = os.path.join(path, name)
            oldfile = filename + '.old'
            print("oldfile  = ", oldfile)
            print("filename = ", filename)
            # rename existing .cos to .cos.old
            os.rename(filename, oldfile)
            # open old file for reading and create replacement .cos file for writing
            with open(oldfile, "r") as f1, open(filename, "w") as f2:
                for line in f1:
                    # read old file line by line, blanking the keyword entries
                    if 'Content_Keywords' in line:
                        line = '\t\t\t<E K="Content_Keywords" V="" />' + '\n'
                    if 'Content_SupplementalCategories' in line:
                        line = '\t\t\t<E K="Content_SupplementalCategories" V="" />' + '\n'
                    # write every line, changed or not, to the new file
                    f2.write(line)
            # both files are closed automatically when the with block ends

Sorry about the formatting. I’m not smart enough to figure out how to make it behave.
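A possible refinement, for anyone reusing the script: instead of replacing the whole line, blank only the V attribute so the original indentation and line ending are left alone. This is a sketch under the assumption that each entry sits on one line in the `<E K="..." V="..." />` shape shown above; `blank_keywords` is a hypothetical helper name.

```python
import re

KEYS = ("Content_Keywords", "Content_SupplementalCategories")

# Matches <E K="<key>" V="<anything>" and captures everything except the value
PATTERN = re.compile(r'(<E K="(?:%s)" V=")[^"]*(")' % "|".join(KEYS))


def blank_keywords(line):
    """Return the line with the V attribute emptied for the keys above; other lines pass through."""
    return PATTERN.sub(r"\1\2", line)


if __name__ == "__main__":
    print(repr(blank_keywords('\t\t\t<E K="Content_Keywords" V="Rated,Tennis" />\r\n')))
```

Inside the loop, `f2.write(blank_keywords(line))` would then stand in for the two `if` tests.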

It’s about done cleaning out that foo foo.

This has been a pain in the ear and is a lesson to be careful with exif data.

Tried to format for you. Thanks for sharing!

Thank you!