Copy and import: bug? (+importing hierarchical tags from lightroom)

Hi all,

I stumbled on a strange behavior of dt. I am trying to move images from Lightroom to dt. I export them from Lightroom as “originals”, which basically copies the original raw file (e.g. myphoto.nef) into the chosen export directory and adds an XMP file (e.g., myphoto.xmp).

Now, if in dt I just use “Import/Add to library”, all the compatible development settings and the tags from lightroom are imported in dt (but more on the tags later). However, if I use “Import/Copy and import”, neither the development settings nor the tags get imported. Is this the intended behavior, or simply a bug? I tend to think the latter…

By the way, I am using dt 4 on a mac (binary from the website).

As a second issue, in every xmp file exported from Lightroom, tags are encoded in three separate ways (example follows):

<dc:subject>
    <rdf:Bag>
     <rdf:li>ATTRIBUTE</rdf:li>
     <rdf:li>Action</rdf:li>
     <rdf:li>Bagnese</rdf:li>
     <rdf:li>Chinese lantern</rdf:li>
     <rdf:li>Emilia Romagna</rdf:li>
     <rdf:li>Europe</rdf:li>
     <rdf:li>Italia</rdf:li>
     <rdf:li>Italy</rdf:li>
     <rdf:li>Modena</rdf:li>
     <rdf:li>Natural Phenomenon</rdf:li>
     <rdf:li>Object</rdf:li>
     <rdf:li>WHAT</rdf:li>
     <rdf:li>WHERE</rdf:li>
     <rdf:li>flying</rdf:li>
     <rdf:li>fog</rdf:li>
     <rdf:li>weather</rdf:li>
    </rdf:Bag>
   </dc:subject>
<dc:subject>
    <rdf:Bag>
     <rdf:li>ATTRIBUTE</rdf:li>
     <rdf:li>Action</rdf:li>
     <rdf:li>Bagnese</rdf:li>
     <rdf:li>Chinese lantern</rdf:li>
     <rdf:li>Emilia Romagna</rdf:li>
     <rdf:li>Europe</rdf:li>
     <rdf:li>Italia</rdf:li>
     <rdf:li>Italy</rdf:li>
     <rdf:li>Modena</rdf:li>
     <rdf:li>Natural Phenomenon</rdf:li>
     <rdf:li>Object</rdf:li>
     <rdf:li>WHAT</rdf:li>
     <rdf:li>WHERE</rdf:li>
     <rdf:li>flying</rdf:li>
     <rdf:li>fog</rdf:li>
     <rdf:li>weather</rdf:li>
    </rdf:Bag>
   </dc:subject>
   <lr:hierarchicalSubject>
    <rdf:Bag>
     <rdf:li>ATTRIBUTE|Action|flying</rdf:li>
     <rdf:li>WHAT|Natural Phenomenon|weather|fog</rdf:li>
     <rdf:li>WHAT|Object|Chinese lantern</rdf:li>
     <rdf:li>WHERE|Europe|Italy|Emilia Romagna|Modena|Bagnese</rdf:li>
    </rdf:Bag>
   </lr:hierarchicalSubject>

I am only interested in importing the last (hierarchical) version of the tags (“hierarchicalSubject”), but dt (when it does) imports both the hierarchical and the flat versions, creating a huge mess in the tag database. Is there a way to tell darktable to import only the hierarchical version? Alternatively, has anybody already devised a script that can go through a directory of xmp files and strip away the non-hierarchical xmp bits?

Finally, I would like to post a humble request for a way to delete multiple tags at once in darktable. Every time I manage to import some tags and the hierarchies get flattened, I find dozens (if not hundreds) of duplicate tags in the tag list, and it is really cumbersome to delete them one by one by right-clicking each tag and choosing the appropriate option.

thank you in advance for your attention and any help

giuseppe

What’s the difference between 1 and 2?

I am sorry, cut and paste error. The second way tags are encoded in the xml file by lightroom is enclosed by:

<lr:weightedFlatSubject>
    <rdf:Bag>
     ...
</lr:weightedFlatSubject>

and contains the same tags as 1) but in a different order (I am sorry I cannot post the content right now as I am on a different computer and don’t have that xml file available here). From the information I could gather on the web about this:

Apparently this is an Adobe-created tag: “LR 11 added a very limited ability to order the keywords for export to Adobe Stock but not other stock agencies.”

So the second way of encoding tags in the xmp file doesn’t seem very useful and can practically be ignored.

All tags starting with “lr:” are Lightroom tags. You should have a line like xmlns:lr="http://ns.adobe.com/lightroom/1.0/" near the beginning of the xmp file. So <lr:hierarchicalSubject> is also a Lightroom tag. A program reading such an xmp file can decide which tags to use, and should ignore tags it doesn’t understand.

I’m not sure darktable allows you to define which tag(s) to use for importing keywords. Might be a useful option perhaps (e.g. Digikam allows you to define the xmp tags to read/write, and in which order to try them while reading).

Yes, that would be very useful, thanks for the suggestion!

Any hint about the other points in my post?

Ok, I will reply to part of my own query :wink: as I found out a quick unix snippet to strip the non-hierarchically encoded keywords from the Lightroom-exported .xmp files.

Open a terminal and cd into the directory where the Lightroom-exported images with their .xmp sidecar files are. Then enter these two commands (the -i '' form is for the BSD sed that ships with macOS; GNU sed takes -i without the empty argument):

sed -i '' -e '/<dc:subject>/,/<\/dc:subject>/d' *.xmp
sed -i '' -e '/<lr:weightedFlatSubject>/,/<\/lr:weightedFlatSubject>/d' *.xmp

and all the xmp files in the directory will be changed in place. Now, when using “Import/Add to library” in darktable, only the hierarchically-organized keywords will be imported.

I hope this is helpful for other people dealing with the same issue migrating from Lightroom to darktable.

Still, I don’t understand why, if I use instead “Import/Copy and import”, neither the development settings nor the tags get imported… anybody else noticed this?

cheers
giuseppe

I think, the copy will copy the raw files to the new location without the xmp. A new xmp will get created in the location per your darktable settings.

You should definitely exercise extreme caution when using string manipulation tools (sed, perl -pi -e, etc.) on XML files, as you may end up with invalid XMP files if you’re not careful.

I’d highly recommend an XML aware tool like Saxon, xmlproc, or lxml for manipulating XML files.
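
For illustration, here is a minimal sketch of the XML-aware route. It uses Python's standard-library xml.etree.ElementTree rather than any of the tools named above, and the function names strip_flat_keywords and strip_directory are made up for this example. It assumes the sidecars use the usual Dublin Core and Lightroom namespace URIs. Note that ElementTree rewrites the whole file, so whitespace, comments and any processing instructions outside the root element are not preserved; try it on copies first.

```python
import io
import xml.etree.ElementTree as ET
from pathlib import Path

DC = "http://purl.org/dc/elements/1.1/"
LR = "http://ns.adobe.com/lightroom/1.0/"
# Elements holding the flat keyword lists we want to drop.
FLAT_TAGS = {f"{{{DC}}}subject", f"{{{LR}}}weightedFlatSubject"}


def strip_flat_keywords(xmp_bytes: bytes) -> bytes:
    """Return the XMP document without dc:subject and lr:weightedFlatSubject."""
    # Re-register the prefixes declared in the file so the output keeps
    # "lr:", "dc:", "x:" etc. instead of generated ns0/ns1 prefixes.
    # (register_namespace rejects prefixes of the form "ns<digits>".)
    for _, (prefix, uri) in ET.iterparse(io.BytesIO(xmp_bytes), events=("start-ns",)):
        if prefix:
            ET.register_namespace(prefix, uri)

    root = ET.fromstring(xmp_bytes)
    # Materialise the element list first so removals do not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag in FLAT_TAGS:
                parent.remove(child)
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def strip_directory(directory: str) -> None:
    """Rewrite every .xmp file in `directory` in place."""
    for path in sorted(Path(directory).glob("*.xmp")):
        path.write_bytes(strip_flat_keywords(path.read_bytes()))


# Example (modifies files in place, so it is left commented out):
# strip_directory(".")
```

Because the elements are matched by namespace URI and not by literal text, this should keep working if a file uses a different prefix or puts the opening tag and its attributes on one line, which is where the sed approach can go wrong.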

Yes, it does seem like this is the actual behavior: but is it how it should be? I am not sure I understand the rationale behind it…

Thank you for the heads up. I had seen similar warnings on the web, but the sed command seems to work well in this case. I’ll have a look at the other tools you suggested, in any case.

I think the functionality of the copy and import is mainly for moving images from cards to harddisk. In those cases, it moves the raw or raw + jpg. I don’t see the use case of also moving xmp since I don’t know of a camera that creates them.

If these Lightroom images are developed with an xmp, why not use the Add to library functionality? What do you gain from the Copy and import?

I see. What I lose by not using Copy and Import is that the files are no longer renamed and moved into a directory created from the exif data (in my case yyyy/mm/dd). When I am exporting a large number of images (~10 years of photos), it is cumbersome to manually create a directory for each day and then export that day's photos from Lightroom into it.

thanks

So you want to rename the raw and the associated xmp using the exif data. I think you can do that with xnview using the companion settings.

Yes, and thanks for pointing me to xnview, but actually renaming is not the important part and I could even skip it. What is crucial for me is to automatically create directories of the form YYYY/MM/DD based on exif and transfer the imported files there. This would be very cumbersome to do manually for a multi-year set of files.

I think xnview mp can do the folders based on exif data.

exiftool can create directory structures and rename files according to metadata. I’ve been using it for years to move files off of SD cards.
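
To make the YYYY/MM/DD idea concrete, here is a rough plain-Python stand-in (it is not exiftool and does not read the raw file's EXIF). It assumes each Lightroom sidecar is named like its raw file (myphoto.xmp next to myphoto.nef) and records the capture time as exif:DateTimeOriginal beginning with YYYY-MM-DD. That is an assumption to check against your own files, and the function name sort_by_capture_date is made up for this sketch.

```python
import glob
import re
import shutil
from pathlib import Path

# Assumption: the sidecar contains exif:DateTimeOriginal, either as an
# attribute (exif:DateTimeOriginal="2015-03-07T...") or as an element.
DATE_RE = re.compile(r'exif:DateTimeOriginal\s*(?:=\s*"|>)\s*(\d{4})-(\d{2})-(\d{2})')


def sort_by_capture_date(src: Path, dest: Path) -> list:
    """Move each sidecar and the files sharing its stem into dest/YYYY/MM/DD."""
    moved = []
    for sidecar in sorted(src.glob("*.xmp")):
        text = sidecar.read_text(encoding="utf-8", errors="replace")
        match = DATE_RE.search(text)
        if match is None:
            continue  # no usable date found: leave the pair where it is
        target = dest.joinpath(*match.groups())
        target.mkdir(parents=True, exist_ok=True)
        # Matches myphoto.nef, myphoto.xmp, ... for the stem "myphoto".
        for item in sorted(src.glob(glob.escape(sidecar.stem) + ".*")):
            moved.append(Path(shutil.move(str(item), str(target / item.name))))
    return moved


# Example (moves files, so it is left commented out):
# sort_by_capture_date(Path("lightroom_export"), Path("photos"))
```

Since the raw and its xmp end up side by side in the dated folder, “Import/Add to library” can then be pointed at the result and should pick up the sidecars as described earlier in the thread.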

Thank you, I’ll have a look at exiftool too then!