I’ve got “detritus” from literally four DAMs in my image keywording.
Many, many moons ago, I started down this road with IMatch, which was interesting. After flailing around a bit, I exported all my keywords to XMP files and adopted iView MediaPro, which later became Expression Media under Microsoft and then Media Pro under Phase One.
And finally all this got migrated to Lightroom about 8 years ago.
I was overly ambitious with my keywording and have way too many keywords, and honestly I stopped applying them about the time I adopted LR.
My copy of LR is getting a bit long in the tooth, so I’ve installed digiKam on a Linux box and it’s currently reading all the data associated with some 70k images on my NAS box.
It’s about 25% complete and in looking at the keywords in the Captions->Tags viewers, this is what I’ve got:
Some of the keywords are displayed as intended and some are just strung together with periods separating the terms.
There are several ways to store keywords in XMP, and they don’t all use the same separators to indicate a tag hierarchy. Perhaps have a look at a few sidecar files in a text editor? (They are plain text, so easy to read. Perhaps less easy to understand…)
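If opening sidecars one at a time gets tedious, here is a minimal Python sketch that lists every XMP property holding a list of values, so you can see at a glance which fields carry keywords and which separators they use. The `SAMPLE` sidecar and the function name `list_keyword_fields` are made up for illustration; a real sidecar will contain many more fields, and you would read it from disk instead.

```python
import xml.etree.ElementTree as ET

# Standard RDF namespace used by XMP packets.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Made-up sidecar content for illustration only.
SAMPLE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:lr="http://ns.adobe.com/lightroom/1.0/">
   <dc:subject><rdf:Bag><rdf:li>People.Family.Verda</rdf:li></rdf:Bag></dc:subject>
   <lr:hierarchicalSubject><rdf:Bag><rdf:li>People|Family|Verda</rdf:li></rdf:Bag></lr:hierarchicalSubject>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""


def list_keyword_fields(xmp_text):
    """Map each property that contains rdf:li items to the list of its values."""
    root = ET.fromstring(xmp_text)
    fields = {}
    for desc in root.iter(f"{{{RDF}}}Description"):
        for prop in desc:
            items = [li.text or "" for li in prop.iter(f"{{{RDF}}}li")]
            if items:
                # prop.tag has the form "{namespace-uri}localname"
                fields[prop.tag] = items
    return fields


for prop, values in list_keyword_fields(SAMPLE).items():
    print(prop, values)
```

This only picks up keywords stored as list elements. Anything stored as an attribute (as some programs do) will not show up, so a look in the text editor is still worthwhile.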
And perhaps one of the programs you used initially didn’t have hierarchical keywords, and you separated them with dots to simulate a hierarchy? See e.g. the “People.Family.Verda” tags you show: that looks to me like a tag for pictures with 2 or 3 people in them: “Sherry”, “Chris” and sometimes others…
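If the dots really are a simulated hierarchy, converting them is a simple split and join. A small sketch, assuming the target field wants “|” between levels (Lightroom’s hierarchical keywords use that; pass a different `separator` if your sidecars show something else). The function name `dotted_to_hierarchy` is hypothetical.

```python
def dotted_to_hierarchy(tag, separator="|"):
    """Turn a dot-separated tag such as 'People.Family.Verda' into a hierarchy path."""
    # Drop empty pieces so stray leading, trailing, or doubled dots do not create blank levels.
    parts = [part.strip() for part in tag.split(".") if part.strip()]
    return separator.join(parts)


print(dotted_to_hierarchy("People.Family.Verda"))
print(dotted_to_hierarchy("People.Family.Verda", separator="/"))
```

Beware of keywords that legitimately contain a dot (an abbreviation, say): this would split those too, so check the list before applying it wholesale.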
Never had any problems with them… If you want to insert a ‘<’, precede it with a backslash (‘\’),
and of course, as @Donatzsky said, use the proper option for code.
Concerning the ACDSee tags: digiKam can read and write those, and quite a few others. I have sidecars with the “acdsee:categories” (as attribute), also some “dc:”, “mediapro:” and “lr:” tags and attributes (and I know I never used any of those three programs!). Check out the XML namespaces (the lines starting with “xmlns:”).
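To see which namespaces a sidecar declares without scrolling through it, something like the following Python sketch should do. It uses the standard library’s `ElementTree.iterparse` with the `start-ns` event; the `SAMPLE` bytes and the name `list_namespaces` are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Made-up sidecar content for illustration only.
SAMPLE = b"""<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:lr="http://ns.adobe.com/lightroom/1.0/"/>
 </rdf:RDF>
</x:xmpmeta>"""


def list_namespaces(xmp_bytes):
    """Return a dict mapping each declared prefix to its namespace URI."""
    namespaces = {}
    # "start-ns" events yield a (prefix, uri) pair for every xmlns declaration.
    for _event, (prefix, uri) in ET.iterparse(io.BytesIO(xmp_bytes), events=("start-ns",)):
        namespaces[prefix] = uri
    return namespaces


for prefix, uri in list_namespaces(SAMPLE).items():
    print(prefix, "->", uri)
```

For a real file, read it in binary mode and pass the bytes in; each prefix that turns up tells you which program’s vocabulary is present in that sidecar.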
Took about six hours, but everything is cleaned up.
Doing some looking around, I found that Capture One had grabbed some of this malformed junkola and written it into its sidecar .cos files. Sigh.
ExifTool will read .cos files but not write them. Sigh, again.
So, whipping out the (t)rusty python, I put this together:
import os

# Folder to scan. A raw string cannot end in a single backslash,
# so use an escaped backslash instead of r"Y:\".
path = "Y:\\"

for root, subdirs, files in os.walk(path):
    for name in files:
        if name.endswith('.cos'):
            filename = os.path.join(root, name)
            oldfile = filename + '.old'
            print("oldfile = ", oldfile)
            print("filename = ", filename)
            # rename existing .cos to .cos.old so the original is kept as a backup
            os.rename(filename, oldfile)
            # read the old file and write a replacement .cos file;
            # "with" closes both files even if something goes wrong
            with open(oldfile, "r") as f1, open(filename, "w") as f2:
                for line in f1:
                    # blank out the keyword and category entries
                    # (end with '\n' rather than a bare '\r', to match the other lines)
                    if 'Content_Keywords' in line:
                        line = ' <E K="Content_Keywords" V="" />\n'
                    if 'Content_SupplementalCategories' in line:
                        line = ' <E K="Content_SupplementalCategories" V="" />\n'
                    # copy every line, changed or not, to the new file
                    f2.write(line)
Sorry about the formatting. I’m not smart enough to figure out how to make it behave.
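For anyone wanting to try the same thing, it may be worth checking the line rewriting on a few lines first, before letting it loose on a whole drive. A minimal sketch with the replacement logic pulled into a function; the name `blank_keyword_lines` and the sample lines are made up, and only the `<E K="..." V="..." />` shape comes from the script above.

```python
# The two entries the script above empties out.
KEYS = ("Content_Keywords", "Content_SupplementalCategories")


def blank_keyword_lines(lines):
    """Return a copy of lines with the keyword and category entries emptied."""
    cleaned = []
    for line in lines:
        for key in KEYS:
            if key in line:
                line = f' <E K="{key}" V="" />\n'
        cleaned.append(line)
    return cleaned


# Made-up sample lines for illustration only.
sample = [
    ' <E K="Content_Keywords" V="People.Family.Verda" />\n',
    ' <E K="Some_Other_Setting" V="1" />\n',
]
for line in blank_keyword_lines(sample):
    print(line, end="")
```

Lines that mention neither key pass through untouched, which is the part worth confirming before trusting it with real files.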
It’s about done cleaning out that foo foo.
This has been a pain in the ear and is a lesson to be careful with EXIF data.