Avoiding duplicates for massive import

Any suggestions on best practices and tricks for avoiding duplicate images? I have some old Aperture and photo libraries from a Mac I am decommissioning, and I would like to import them into darktable. It appears that darktable will avoid reimporting images if they are named the same, but will happily import the same images if they are named differently. Or is it best to do some prescreening outside darktable? Any help or suggestions appreciated.

I have a similar situation to resolve for myself right now. I haven’t actually done it yet, but I’ve done some dry runs of a procedure based on ExifTool.

https://exiftool.org/faq.html

  1. “How do I export information from exiftool to a database?”

ExifTool can produce a CSV file with the columns in the same order as the arguments on the command line.

someguy@somesys:~/Pictures$ exiftool -s -r -csv -filename -createdate -directory -filesize -exposuretime -fnumber -iso -shutterspeed . > ./pictures-index-somesys.txt

So I can run the same command on both systems (to separate files). Then I can pull both files into a spreadsheet or a database and slice and dice as I please to identify duplicates.
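If you would rather stay on the command line than open a spreadsheet, here is a rough sketch of that comparison step. It assumes the two index files were produced by the command above, so each has a header row with names such as SourceFile, CreateDate and FileSize. The function name match_indexes is made up for illustration, and the naive comma split will break on any path that contains a comma, so treat it as a starting point only.

```shell
# Hypothetical helper: print pairs of files from two exiftool CSV indexes
# whose capture metadata matches, even when the file names differ.
# Assumption: no field contains a comma (plain comma split, not a real CSV parser).
match_indexes() {
    awk -F, '
        FNR == 1 {                            # header row of each file
            split("", col)                    # clear the column map
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        {
            if ($(col["CreateDate"]) == "") next   # skip files with no capture date
            key = $(col["CreateDate"]) "|" $(col["FileSize"]) "|" $(col["ExposureTime"])
            key = key "|" $(col["FNumber"]) "|" $(col["ISO"])
        }
        NR == FNR { seen[key] = $(col["SourceFile"]); next }   # first index: remember keys
        (key in seen) { print seen[key] " <-> " $(col["SourceFile"]) }
    ' "$1" "$2"
}

# Usage (file names are placeholders):
#   match_indexes pictures-index-somesys.txt pictures-index-othersys.txt
```

Matching metadata only makes two files likely duplicates, since ExifTool prints a rounded file size by default, so I would still eyeball the pairs before deleting anything.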

Having said all that, I’ve never even touched a Mac in my life.

Edit: …and there may be some ready-made tool that identifies duplicates that I don’t know about!

You could use digiKam to do it if you can’t find anything else.

If you have a Linux machine it looks like rdfind will do it!

So I think I have found a solution that I like. I need to do more testing, but it appears that

rdfind -deleteduplicates true -ignoreempty false dir1 dir2

will delete duplicate files in the two directories, keeping the copy found under the directory listed first. Following this with

find dir2 -empty -type d -delete

will delete the empty directories that rdfind leaves behind. It also appears that rdfind can create a Mac Time Machine-like setup: if dir1 is a backup of dir2, it can create symbolic links between the directories. That is useful if you want two directories where the files are the same but organized differently. But I digress.

So the plan would be to use those commands to reduce the folders to unique images. Then I could use darktable to import the folders (of images) and rename and organize by exif and file data.
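Since the rdfind step deletes files, it may be worth previewing what counts as a duplicate first. Below is a minimal sketch that groups files by content checksum without touching anything. It assumes a Linux box with GNU sha256sum, file names without newlines or backslashes, and the function name list_dupes is just something I made up. rdfind's man page also lists a -dryrun option, which is worth checking on your version.

```shell
# Hypothetical helper: print groups of files with identical content
# found under the given directories. Nothing is deleted or modified.
list_dupes() {
    find "$@" -type f -exec sha256sum {} + |
        LC_ALL=C sort |
        awk '
            {
                hash = $1
                path = substr($0, 67)     # 64 hex chars + 2 separator chars, then the path
                if (hash == prev) {
                    if (!shown) print prev_path   # first member of a new duplicate group
                    print path
                    shown = 1
                } else {
                    if (shown) print ""           # blank line between groups
                    shown = 0
                }
                prev = hash
                prev_path = path
            }'
}

# Usage:
#   list_dupes dir1 dir2
```

Each group in the output is a set of byte-identical files, so renamed copies show up together. Files that differ only in embedded metadata will not match here.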
