Lightroom to Darktable, bringing over location metadata (not gps!)


(Alessandro Amato Del Monte (Aadm)) #1

Hi everyone. I have struggled a lot with the decision, with the painful move from Aperture to Lightroom still fresh in my mind. But after Adobe stopped working on the standalone version of LR and before it’s too late I’ve made the big jump, from mac+lightroom to linux+darktable. I trust in open source and hope that DT can continue to exist for many years to come.

I am currently working to avoid losing all the metadata I have filled in over the years, with the idea to recreate all collections that I had in Lightroom (vacations/country_A, country_B, etc). My LR database consists of over 70k photos.

Stars and tags are ok, but I’ve noticed that the Location and Country keywords are not copied when importing raws and their xmp sidecars over from Lightroom.

I have come up with a bash script that uses exiftool to copy the content of those two keywords into hierarchical tags:

#! /bin/sh
raws=(".NEF" ".RAW" ".ORF")

OIFS="$IFS"
IFS=$'\n'

for RAW in "${raws[@]}"; do
  for FILE in `find . -type f -iname "*${RAW}"`; do
    INPUT=$(basename ${FILE} ${RAW}).xmp
    OUTPUT=$FILE.xmp
    LOCAT=$(exiftool -t -S -xmp:location "${INPUT}" | cut -d":" -f 1)
    COUNT=$(exiftool -t -S -xmp:country "${INPUT}" | cut -d":" -f 1)
    if [[ $LOCAT ]]; then
      exiftool "-xmp:Subject+=$COUNT" "-xmp:Subject+=$LOCAT" "-xmp:HierarchicalSubject+=$COUNT|$LOCAT" $OUTPUT
    fi
  done
done

IFS="$OIFS"

This script extracts Location and Country from the original Lightroom sidecar and, if they exist, writes them as normal tags to the xmp:Subject field (for example: Italy,Rome) and as hierarchical tags, with the first level set to “Location”, to the xmp:HierarchicalSubject field (for example: Location|Italy|Rome).
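
As a rough sketch of the mapping just described, the helper below builds the exiftool arguments for one file. The function name build_tag_args and the tag_args array are made up for illustration, and the fixed “Location” top level follows the description rather than the script as posted, so treat it as an assumption:

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the script above): fills the global
# array tag_args with exiftool arguments for one country/location pair.
build_tag_args() {
    local country=$1 location=${2:-}
    tag_args=()
    # Nothing to write without a country
    if [[ -z $country ]]; then
        return 1
    fi
    # One += per flat tag, so each value becomes its own list item
    tag_args+=("-XMP:Subject+=${country}")
    if [[ -n $location ]]; then
        tag_args+=("-XMP:Subject+=${location}")
    fi
    # Hierarchy levels are joined with "|"; "Location" is the assumed root
    tag_args+=("-XMP:HierarchicalSubject+=Location|${country}${location:+|${location}}")
}

# Example: show the arguments that would be passed to exiftool
build_tag_args "Italy" "Rome"
printf '%s\n' "${tag_args[@]}"
```

The array could then be handed to exiftool as "${tag_args[@]}" followed by the sidecar path.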

The files I’m importing are Nikon, Olympus and Fuji’s raws (respectively NEF, ORF and RAF extensions).

When the raws get imported in Darktable these tags should also be read and copied as normal tags that I can then use to make collections, albums or whatever they’re called (still need to figure out this part).

I have random questions and things to do:

  • is exiv2 a better choice for this task? I would need to figure out its syntax in that case. I have the impression that darktable uses exiv2, perhaps because it is faster? Also, I seem to be unable to read darktable tags (once the file is imported) with exiftool, while I can see them using exiv2 (Xmp.dc.*; I think the fields where DT stores them are Xmp.dc.subject and Xmp.lr.hierarchicalSubject).
  • need to add a loop for jpeg files too
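
For the JPEG item, one option is to let a single find match all extensions at once instead of looping per extension. A minimal sketch, assuming the extension list below (NEF, ORF, RAF plus JPG/JPEG) covers the library; the function name find_images is invented here:

```bash
#!/usr/bin/env bash
# Hypothetical helper: print every raw or JPEG file under a directory,
# NUL-delimited so that names with spaces or newlines survive intact.
find_images() {
    local dir=${1:-.}
    find "$dir" -type f \( \
        -iname '*.nef' -o -iname '*.orf' -o -iname '*.raf' \
        -o -iname '*.jpg' -o -iname '*.jpeg' \) -print0
}

# Example: collect the matches into an array and report how many were found
images=()
while IFS= read -r -d '' file; do
    images+=("$file")
done < <(find_images .)
printf 'Found %d image files\n' "${#images[@]}"
```

Reading NUL-delimited output this way also removes the need to change IFS globally.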

If anybody is facing similar problems and wants to collaborate or help me with the above please let me know.

thanks!

UPDATE: I have modified the script above and it has worked on a test folder. I am now trying it on another folder with >4000 files and will let you know the results tomorrow.

The main changes are on the find line at the beginning of the loop, which now also handles names with spaces (together with the IFS and OIFS lines). I also changed the way tags are added to xmp:Subject and xmp:HierarchicalSubject so that existing tags are not overwritten (I simply append country and location). Finally, I have removed the overwrite_original option in exiftool, so that if anything goes wrong this command is all you need to restore the originals:

exiftool -restore_original -ext xmp 

And to delete the backup copies once everything works as intended:

exiftool -delete_original -ext xmp 
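
Before running either command it can help to see which backups actually exist. When overwrite_original is not used, exiftool keeps the previous version of each edited file under the same name with "_original" appended. A small sketch for listing them; the function name list_backups is just an illustration, and the count is only reliable if no file name contains a newline:

```bash
#!/usr/bin/env bash
# Hypothetical helper: list the backup copies exiftool left behind.
# exiftool names them by appending "_original" to the edited file name.
list_backups() {
    local dir=${1:-.}
    find "$dir" -type f -name '*.xmp_original' -print
}

# Example: count the backups under the current directory
count=$(( $(list_backups . | wc -l) ))
printf '%d sidecar backups found\n' "$count"
```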

(Mica) #2

Hello & welcome! I think we have a helpful crew here to assist you. :smiley:

Since the source is open, you can always have the latest version as long as you want. There are several dedicated developers on the project and it is mature enough that I don’t think it is going anywhere!

Your script looks like it should do the trick. If you have the XMP sidecar files (or if you can coax Lightroom to write sidecar files), darktable also uses XMP sidecar files, so you could just operate on the XMP (XML) files instead of your raw files. That’d certainly be much safer than operating on all your raw files.

Looking at one of my XMP sidecar files, I see:

<x:xmpmeta>
...
<rdf:RDF> 
 <rdf:Description>
 ...


   <dc:subject>
    <rdf:Seq>
     <rdf:li>tag</rdf:li>
     <rdf:li>test</rdf:li>
     <rdf:li>yeah</rdf:li>
    </rdf:Seq>
   </dc:subject>
   <lr:hierarchicalSubject>
    <rdf:Seq>
     <rdf:li>test|tag|yeah</rdf:li>
    </rdf:Seq>
   </lr:hierarchicalSubject>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>

I’d think they’re equal at this task; either will do. I think programs choose exiv2 because it is all in C++ (or C), so it is faster, while exiftool is in Perl. But reading files from disk will probably be your biggest bottleneck, not so much execution speed.

You seem to be correct, see my example above.

Sure! I don’t have this problem, but I’d be happy to help however I can.

Again, if you can get Lightroom to dump XMP files, then manipulating the XMP files will be fast, easier, and safer than operating directly on your raw files.


(Tobias) #3

Open source is like a club, the more you help the better it gets.

I think your script is a good start and a fast fix for the problem. The better solution is to improve the darktable LR import. That’s more work, harder to code and you need to wait for the next release, but then it works for everyone.

@phweyland is at the moment trying to improve the LR import module in darktable. Perhaps you can work together and help to improve it.

Have a look here:


And here:


(Morgan Hardwood) #4

The script is broken and potentially dangerous. Please review it.
If needed, I can give you a hand.


(Alessandro Amato Del Monte (Aadm)) #5

Thanks for the comment, can you show me where exactly the script is broken? I’m going to review it later, but if you have already noticed something very evident it’d make it easier for me.


(Morgan Hardwood) #6

@aadm sure, I’ll do that tonight - a few hours from now.


(Alessandro Amato Del Monte (Aadm)) #7

Thanks everyone for the comments (also @Morgan_Hardwood for pointing out the potential dangers; I have now realized that using “overwrite” in exiftool is perhaps not very clever, and I was also overwriting existing tags instead of simply appending. I have now modified the script and it has worked correctly on a test folder).


(Mica) #8

I’ll plug the XMP route again! Make XMP sidecars, darktable reads them. If Lightroom won’t write them out, exiftool will!


(Alessandro Amato Del Monte (Aadm)) #9

About using xmp sidecars, that’s exactly what I’m doing here. Reading the original xmp from lightroom, extract country and location then write them in the darktable xmp. I loop over the actual raw filenames and I use them only as a guide (see in the script where I define INPUT and OUTPUT).
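
To make the INPUT/OUTPUT idea concrete, here is a minimal sketch of deriving both sidecar paths from one raw path. It assumes Lightroom wrote "name.xmp" next to the raw and that darktable uses "name.ext.xmp", as in the script; the function and variable names are invented for the example:

```bash
#!/usr/bin/env bash
# Hypothetical helper: derive both sidecar paths from one raw file path.
# Assumes Lightroom wrote "name.xmp" and darktable expects "name.ext.xmp".
sidecar_paths() {
    local raw=$1
    lr_sidecar="${raw%.*}.xmp"   # drop the raw extension, keep the directory
    dt_sidecar="${raw}.xmp"      # append .xmp to the full raw file name
}

# Example with a path containing spaces
sidecar_paths "vacations/country A/DSC 0001.NEF"
printf '%s\n' "$lr_sidecar" "$dt_sidecar"
```

Unlike basename, the parameter expansion keeps the directory part and does not care about the case of the extension.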

Thanks again @paperdigits for the replies!


(Mica) #10

Ah, I didn’t catch your updated post!


(Morgan Hardwood) #11

@aadm ShellCheck will tell you most of the things which need to be corrected:

In addition to what ShellCheck will explain, I will add that:

  • It’s a Bash script which requires Bash, but the hashbang does not reflect that.
  • Uppercase variable names by convention mean they are environment variables. These are not envvars, so they should not be all uppercase.

Disclaimer: I had to guess what the script is supposed to do from your description, and I don’t know what tree and filename structure we’re dealing with. I’m assuming it could be files with spaces in their names, in “this” folder and in sub-folders:

foo bar.raw
foo bar.xmp
photos/cat dog.nef
photos/cat dog.xmp
photos/2018-04-25 forest/squirrel nuts.jpg
photos/2018-04-25 forest/squirrel nuts.xmp

Here is a version based on your script:

#!/usr/bin/env bash
# Version 2018-04-26
# This script appends the values of XMP:Location and XMP:Country
# to XMP:Subject and XMP:HierarchicalSubject

# Extensions taken from https://github.com/Beep6581/RawTherapee/blob/dev/rtdata/options/options.lin#L15
IFS=";" read -r -a extensions <<< "3fr;arw;arq;cr2;crf;crw;dcr;dng;fff;iiq;jpg;jpeg;kdc;mef;mos;mrw;nef;nrw;orf;pef;png;raf;raw;rw2;rwl;rwz;sr2;srf;srw;tif;tiff;x3f;"

for ext in "${extensions[@]}"; do
    while read -r -d '' file; do
        # Strip the extension, whatever its case, and append .xmp
        # (a case-sensitive replace would leave "foo.NEF" unchanged and target the raw file)
        sidecar="${file%.*}.xmp"
        # If file exists and is writable
        if [[ -w $sidecar ]]; then
            # Set "country" variable with value from "XMP:Country"
            # Set "location" variable with value from "XMP:Location"
            country="$(exiftool -short3 -XMP:Country "${sidecar}")"
            location="$(exiftool -short3 -XMP:Location "${sidecar}")"
            # Process if $country tag is not empty
            if [[ -n $country ]]; then
                printf '%s\n' "Processing ${sidecar}"
                # One += per Subject value: a comma-joined string would be stored as a single tag
                exiftool -preserve -progress "-XMP:Subject+=${country}" ${location:+"-XMP:Subject+=${location}"} "-XMP:HierarchicalSubject+=${country}${location:+|}${location}" "${sidecar}"
            fi
            fi
        fi
    done < <(find . -iname "*.${ext}" -print0)
done

However! I think the script can be simplified. Why loop over raw file types and try to find matching XMPs, when you can just directly find all XMPs and update them accordingly?

#!/usr/bin/env bash
# Version 2018-04-26
# This script appends the values of XMP:Location and XMP:Country
# to XMP:Subject and XMP:HierarchicalSubject

arr_join () { local IFS="$1"; printf '%s' "${tags[*]}"; }

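while read -r -d '' xmpFile; do
    tags=()
    while read -r -d $'\n'; do
        [[ -n $REPLY ]] && tags+=("$REPLY")
    done < <(exiftool -short3 -XMP:Country -XMP:Location "$xmpFile")

    # Skip sidecars that have neither a country nor a location
    (( ${#tags[@]} )) || continue

    printf '%s\n' "Processing ${xmpFile}"
    # Add each value as its own Subject item; join with "|" only for the hierarchical tag
    exiftool -preserve -progress "${tags[@]/#/-XMP:Subject+=}" "-XMP:HierarchicalSubject+=$(arr_join "|")" "$xmpFile"
done < <(find . -iname "*.xmp" -print0)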

As I don’t have sample files to test on, this is just a proposal which you would need to review and test.
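
One way to do that testing without touching the real library is to run the script on copies of the sidecars first. A minimal sketch, assuming the destination lies outside the source tree; the function name stage_sidecars and the paths in the usage line are placeholders:

```bash
#!/usr/bin/env bash
# Hypothetical helper: copy every .xmp under SRC into DEST, keeping the
# directory layout, so a script can be tried on the copies first.
stage_sidecars() {
    local src=${1%/} dest=$2 file rel
    while IFS= read -r -d '' file; do
        rel=${file#"$src"/}                 # path relative to SRC
        mkdir -p "$dest/$(dirname "$rel")"
        cp -p "$file" "$dest/$rel"
    done < <(find "$src" -type f -iname '*.xmp' -print0)
}

# Example (paths are placeholders):
# stage_sidecars ~/Pictures /tmp/xmp-test && cd /tmp/xmp-test
```

The second script above only looks at .xmp files, so it can be run as-is inside the staging directory and the results compared with the originals.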