Last year I found myself in a situation where I needed to have some 10000 pictures automatically tagged.
I discovered that all existing software cost a ridiculous amount of money, required a subscription, and/or was cloud based.
So I decided to write an open source tool that runs on the local machine and is free for everyone to use. There are binaries for Linux, Windows and macOS.
Maybe it can be of use to you. It works like a charm with darktable, and it writes the tags it found either into already existing XMP sidecar files or creates new ones.
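To illustrate what "writing tags into an existing XMP sidecar or creating a new one" can look like, here is a minimal sketch. It is not STAG's actual code: the function name write_tags is hypothetical, and it only handles the standard dc:subject keyword bag, assuming the sidecar has the usual x:xmpmeta / rdf:RDF / rdf:Description layout.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Standard XMP/RDF/Dublin Core namespaces.
NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}
for _prefix, _uri in NS.items():
    ET.register_namespace(_prefix, _uri)


def q(prefix, name):
    """Build a namespace-qualified tag name for ElementTree."""
    return f"{{{NS[prefix]}}}{name}"


def child(parent, tag):
    """Return the first child with this tag, creating it if missing."""
    node = parent.find(tag)
    if node is None:
        node = ET.SubElement(parent, tag)
    return node


def write_tags(sidecar, tags):
    """Merge tags into the dc:subject bag of a sidecar (hypothetical helper).

    Creates the file if it does not exist. Returns all keywords now stored.
    Caveat: ElementTree drops the xpacket processing instructions and may
    rename prefixes of namespaces that are not registered above.
    """
    sidecar = Path(sidecar)
    if sidecar.exists():
        tree = ET.parse(str(sidecar))
        root = tree.getroot()
    else:
        root = ET.Element(q("x", "xmpmeta"))
        tree = ET.ElementTree(root)

    rdf = child(root, q("rdf", "RDF"))
    desc = rdf.find(q("rdf", "Description"))
    if desc is None:
        desc = ET.SubElement(rdf, q("rdf", "Description"))
        desc.set(q("rdf", "about"), "")
    bag = child(child(desc, q("dc", "subject")), q("rdf", "Bag"))

    # Keep existing keywords and append only the new ones.
    existing = [li.text for li in bag.findall(q("rdf", "li"))]
    for tag in tags:
        if tag not in existing:
            ET.SubElement(bag, q("rdf", "li")).text = tag
            existing.append(tag)

    tree.write(str(sidecar), encoding="utf-8", xml_declaration=True)
    return existing
```

Calling write_tags twice on the same file adds only the keywords that are not there yet, so keywords already in the sidecar are left alone.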
(divis.io is the company I work for and has kindly supported me in this endeavour)
Let me know if you like it.
Kind regards,
Stephan
P.S.: For those of you (understandably) skeptical of AI, I share your concerns. Here’s a little excerpt from the FAQ:
But isn’t using AI bad for the environment?
It’s important to understand that not everything utilizing AI has to do with LLMs running on multi-billion dollar server farms owned by billionaires. STAG uses a very small convolutional neural network (CNN) which does not even need a GPU to run fast and efficiently. You can run STAG on a perfectly normal computer and not draw more power than your Adblocker needs for making the internet bearable.
I ran it on my system and it gave a very strange output and did not find any image files. Furthermore the output typeface is tiny and I had to use the zoom aid to read it.
I’m very sorry that it doesn’t work for you. Could you maybe elaborate as to what strange output you are seeing?
I can’t be sure what’s happening, but if the output is something in the form of “ram_plus_swin_large_14m.pth… 10%…”, that’s the initial model download.
When running the tagger for the first time, STAG needs to download the recognize-anything model from huggingface in order to be able to run locally on your own machine only. As the model is 3.2GB in size, this process can take a few minutes. The model is only downloaded once.
In theory it could be made possible in a future version. The list of tags is at the moment residing in the recognize-anything module. I’d have to take it out of there and make it modifiable.
But it would not be a matter of adding and removing tags (as the model is trained with these specific tags and their weights) but rather a matter of restricting which tags the model can generate.
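Restricting the output rather than changing the model could be as simple as a filter applied after prediction. The following is only a sketch of that idea, not something STAG offers today; restrict_tags and its parameters are hypothetical names.

```python
def restrict_tags(predicted, blocked=(), allowed=None):
    """Filter model output without touching the model itself.

    predicted: tags as returned by the model, in order.
    blocked:   tags to drop (compared case-insensitively).
    allowed:   if given, only these tags may pass.
    """
    blocked_lower = {t.lower() for t in blocked}
    allowed_lower = None if allowed is None else {t.lower() for t in allowed}

    kept = []
    for tag in predicted:
        key = tag.lower()
        if key in blocked_lower:
            continue
        if allowed_lower is not None and key not in allowed_lower:
            continue
        kept.append(tag)
    return kept


# Example input; the tag list is illustrative.
tags = ["bend", "building", "Eiffel tower", "person", "man", "umbrella"]
print(restrict_tags(tags, blocked={"person", "man", "building"}))
```

The model still computes every tag; the filter only decides which ones end up in the sidecar file.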
@Stephan, I tried Stag today and have a couple of remarks.
My daughter was tagged as “pirate” because she wears an earring similar to Johnny Depp’s in his Pirates of the Caribbean films!!!
I started using the commandline version with stag-py dir. A bit slow on my old pc, but the xmp files were generated. gThumb showed the labels; ART did not, and neither did Darktable or Digikam. I looked everywhere in Darktable for the option “use darktable-compatible filenames” but couldn’t find it.
So I decided to make the executable for Linux et voilà, that option is in Stag itself apparently! Once checked I ran Stag again and now the tags appear in Darktable, Digikam and ART, but not in gThumb anymore. Couldn’t find that darktable-compatible option in the --help of the commandline version…
Then about the tags themselves, they are quite generic like woman, person, man, format, jpg, picture, photo, boat, etc. I’m not sure how useful this is. Stag did recognize the Eiffel tower though (but did not add the tag Paris), but not the Centre Pompidou/Beaubourg, although it recognized the escalator at the outside of the building.
My main point of criticism is that Stag generates too many tags that are in my opinion not that useful. Example of the photo with the Eiffel tower: Tags found: ['bend', 'building', 'Eiffel tower', 'person', 'man', 'stand', 'umbrella', 'woman']
When I search my photo collection on tags, I do not search on umbrella or person but on Eiffel tower or Paris. According to me tagging a large photo collection is an art in itself and I am not convinced that AI can help a lot here.
Having said that, I do appreciate your efforts Stephan! As Stag says at the end of its work “The mighty STAG has done its work. Have a nice day”, I wish you a nice day as well!
Many, many thanks for testing so thoroughly and sharing your thoughts.
Yes, the recognize-anything model recognizes a lot of generic tags (like “person”, “woman”, “man”, “dirt road”, “building”, etc.), and it’s not at all suited to recognize specific things.
But that’s also sort of the use case. I can quickly look at all the images with dogs in them, which otherwise would take me days. Or the images with something red in them. Or images where I was in a green field.
Then I can take over and narrow down the search and tag my images in earnest, the way I’d like to have them organized.
(that’s also one of the reasons why I’m providing a prefix for stag generated tags: It’s meant to be an aid in tagging your image, not as a replacement for the creative person who makes the final choice about a tag ;-))
In regards to the darktable-compatible option in the command line version: It’s the “--prefer-exact-filenames” option.
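The difference between the two naming schemes is small but explains why some programs see the tags and others don't: darktable names its sidecars after the full image filename (photo.jpg.xmp), while other tools expect the extension to be replaced (photo.xmp). A sketch of that choice follows; sidecar_path is a hypothetical helper, and mapping the flag to these two forms is an assumption about how STAG behaves.

```python
from pathlib import Path


def sidecar_path(image, prefer_exact_filenames):
    """Return the sidecar path for an image under either naming scheme.

    prefer_exact_filenames=True  -> photo.jpg.xmp (darktable style)
    prefer_exact_filenames=False -> photo.xmp     (extension replaced)
    """
    image = Path(image)
    if prefer_exact_filenames:
        # Append ".xmp" to the complete filename, extension included.
        return image.with_name(image.name + ".xmp")
    # Swap the image extension for ".xmp".
    return image.with_suffix(".xmp")


print(sidecar_path("holiday/IMG_0001.jpg", True))
print(sidecar_path("holiday/IMG_0001.jpg", False))
```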
Great tool. I quickly checked out the git repo and used the CLI tool instead of the GUI. Mainly because the letters are really small and hard to read. I don’t know if I will use it. As others said, some of the tags are rather generic, but as you pointed out yourself, that’s kind of the point. I might use it with a prefix. In my test run I tried to set the prefix to “stag|” because darktable would create stag as a folder. Unfortunately that didn’t work and it crashed.
The thing I would be more interested in is face recognition. I know there is a plugin for darktable, but the documentation isn’t very good.
Regarding the crash you are experiencing: You must not include the “|” sign in your prefix, stag adds it automatically to a given prefix. I’ll fix the crash in the next release.
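One way such a crash could be avoided is to normalize the prefix before joining it to the tag, so that both “stag” and “stag|” give the same result. This is only a sketch of the idea, not the fix that went into STAG; build_tag is a hypothetical name, and the “|” separator is the one darktable uses for hierarchical tags.

```python
def build_tag(prefix, tag, separator="|"):
    """Join prefix and tag with exactly one separator between them.

    Leading/trailing whitespace and separators on the prefix are
    removed first, so "stag" and "stag|" behave identically.
    """
    clean = prefix.strip().strip(separator)
    if not clean:
        # No usable prefix: return the tag unchanged.
        return tag
    return f"{clean}{separator}{tag}"


print(build_tag("stag", "dog"))
print(build_tag("stag|", "dog"))
```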
Kind regards,
Stephan
P.S.: I also find face recognition very interesting and useful. Maybe that’s what I’ll tackle next.
Very interesting @Stephan , well done, full marks. I probably won’t try it, not yet anyway, I spend too much time at my PC as it is. Sounds like this could get really powerful, my photos are not tagged, maybe same for a lot of people.
The trouble with digikam (I last tried a few years ago) is that it shows me all photos with faces, not only those with recurring ones. That means, I was given thousands of photos with thousands of faces, and my family members were simply buried in the sea of strangers. I gave up filtering and tagging after a few hours.
A practical thing, when I click Browse to open a directory, stag defaults to the folder where stag is stored. Then I go up to /home and see a lot of hidden files. An option “do not show hidden files” would be practical, as well as “go to last visited folder”.
I think stag would be way more useful when a user could select which tags not to use, like person, woman, man, air, water, building, picture, photo, jpg, etc. Those are, imo, too generic to be useful.
I downloaded some photos of well-known places and a famous painting: the Brandenburger Tor (Berlin), the Brooklyn Bridge (New York), Big Ben in London and the Mona Lisa, Paris. Only Big Ben was tagged as such. So the Recognize-anything-plus-model that stag uses does not recognize everything! On the other hand, when I upload the Mona Lisa image to a French AI machine (Mistral), it gives detailed info about the painting.
So Stephan, some homework for you: after you have finished the face recognition module, start working on a famous building recognition module coupled with city recognition, as well as a famous paintings module. To start with…