Autotagging with digiKam

Pierre7602 · March 28, 2025, 8:59pm

Hi everybody!

I would like to know whether any of you use this feature in digiKam.
After upgrading to 8.6.0, I ran some tests on an album containing a few photos.
I found it quite slow, and it sometimes gave no results or irrelevant ones.
Maybe it needs to be tweaked a bit, or to learn from examples just like face recognition.
So any experience or advice is welcome, as the feature seems promising.
Regards.
Pierre

Michmill · March 29, 2025, 11:04am

Hi @Pierre7602,
Yes, Autotagging can be slow. I recommend enabling “Work on all processor cores”, as this will speed things up considerably. Also, are your collections on a local drive, or on some type of remote storage like a NAS? Using a NAS can also slow things down significantly.

As for models, I think YOLOv11 XLarge is the most accurate, but YOLOv11 Nano is by far the fastest. EfficientNet B7 is good for tagging the overall scene, and the YOLO models are good for finding objects in the image.

Yes, I use autotagging for all my images, but I’m also a bit biased. I wrote the autotagging feature.

Cheers,
Mike

Mike_Bing · March 29, 2025, 11:11am

Extremely slow, English only and not very accurate either. I tested it but I’ll wait for a few more decades for it to become better…

betazoid · March 29, 2025, 11:15am

I can confirm this although I didn’t test it recently.

Michmill · March 29, 2025, 2:08pm

Hi @Mike_Bing,

Yes. I agree. The models we use are the best available for local use. There are better ones, but they require sending the image to the cloud for processing. We take your privacy seriously, so we really try not to use any cloud services.

The only embedded cloud service we use is the tag translator. You can choose to have each autotag translated when it is detected.

There is another option for non-English tags, but it’s quite technical. Any changes made this way will be overwritten when you install a new version of digiKam.

Cheers,
Mike

Pierre7602 · March 30, 2025, 8:23pm

Hi Mike!

My collection is on a NAS. I often wonder whether it would be better to keep my collection on a local drive or on a NAS. The NAS is quieter and its capacity can be expanded. But that’s another subject.
I’m not sure whether I checked “Work on all processor cores”.
Thank you for giving us more information about the models that are used.
Would you recommend running different models one after the other to get better results? For example, EfficientNet B7 followed by YOLOv11 XLarge (or vice versa)?

Cheers, Pierre.

Michmill · April 1, 2025, 11:59pm

Hi @Pierre7602,
The order of the models doesn’t affect the results. Just know that YOLOv11 Nano and YOLOv11 XLarge will write the same tags, but YOLOv11 XLarge will typically be more accurate and will find more objects in images.
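
One way to picture why the order doesn’t matter, as a rough sketch only: assume each model contributes a set of tags and the results are combined like a set union. That merging behaviour is an assumption made for illustration, not digiKam’s actual code, and `merge_tags` and the example tags are hypothetical.

```python
def merge_tags(*model_outputs):
    """Merge per-model tag collections into one sorted, de-duplicated list.

    Assumption: tags from different models are combined like a set union,
    so the order in which the models run cannot change the final result.
    """
    merged = set()
    for tags in model_outputs:
        merged.update(tags)
    return sorted(merged)


# Hypothetical example tags, not real digiKam output.
scene_tags = ["beach", "sunset"]          # e.g. from a scene classifier
object_tags = ["dog", "person", "beach"]  # e.g. from an object detector

# Running "scene then objects" or "objects then scene" gives the same tags.
same = merge_tags(scene_tags, object_tags) == merge_tags(object_tags, scene_tags)
print(same)
```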

“Work on all processor cores” is recommended for Autotagging and Face Recognition. Those functions use new pipelines added to digiKam in 8.6.0 that are much more efficient, take advantage of GPU processing, and are tuned to let you do other things in digiKam while the process is running. Typically you won’t see much of a slow-down in other functions while autotagging or face recognition is running, even with “Work on all processor cores” checked, but the process will complete much faster.

Cheers,
Mike

DazzyWalkman · April 27, 2025, 2:44am

I confirm this. This is a useful feature, but needs some polish.
I have been using autotag for the last 2 days with digiKam 8.6.0 and the EfficientNet B7 model.
Yes, it’s slow: after one day, a full face detection/recognition run showed 20% progress, versus 0% for autotag.
And the translate feature does not seem to work, even though the service provider in the settings claims to support my language. I have yet to troubleshoot whether it is a server-side problem.
I am more concerned about getting a “hide (auto, or some other type) tag” feature, because all tags are shown in the People view, which can be frustrating when my screen is not big enough.

UweOhse · April 28, 2025, 4:21pm

Results are getting better, but I’ll abstain from using any auto tagger until they achieve some kind of consistency in their predictions.

This has been recognized as sheep:

and this as a bear:

I don’t mind inaccuracy, but for my use case inconsistency makes auto-tagging worse than worthless.
Right now I’d find the goslings from that one shoot under untagged, bird, hare, bear, sheep, dog, badger, pelicans or goose.

The grey geese were recognized as american coots, pelicans and red-breasted merganser.

Fixing that would take me longer than tagging by hand.
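
To put a number on that kind of inconsistency, here is a minimal sketch. It assumes you have somehow collected one predicted label per photo from a single shoot; how you export them is up to you, and this is not a digiKam API. The function name `label_consistency` is hypothetical, and the sample list simply reuses the labels mentioned above, one per photo, for illustration.

```python
from collections import Counter


def label_consistency(labels):
    """Return (most common label, its share of all photos, distinct label count).

    A share close to 1.0 means nearly every photo of the same subject
    received the same label; a low share means the predictions are scattered.
    """
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    top_label, top_count = counts.most_common(1)[0]
    return top_label, top_count / len(labels), len(counts)


# Illustrative only: one hypothetical prediction per gosling photo.
predictions = ["untagged", "bird", "hare", "bear", "sheep",
               "dog", "badger", "pelicans", "goose"]

top, share, distinct = label_consistency(predictions)
print(f"most common: {top}, share: {share:.0%}, distinct labels: {distinct}")
```

With nine different labels for nine photos of the same subject, no label covers more than one photo, which is why searching by any single tag finds almost nothing from the shoot.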