Git annex use in workflow

LFS and annex are similar and try to solve a similar problem, but:

  • with LFS, the files aren’t actually stored in your repo
  • LFS is centralized and requires hosting to use
  • annex has way more features and possibilities.

I still haven’t dared to feed my photo collection to git annex. I installed it way back and had a more complex setup of devices. During the experiments I shot myself in the foot quite a few times…

In theory it should save you from foot shooting; in practice, not so much. I want to use it because, as it is, I’m locked into editing on my desktop. I also can’t easily choose which files should be on my laptop.

I do use it for my rather vast collection of ideas and references. Many of them are images added using addurl. I find it helpful how the URL gets saved with the metadata.

Maybe one day I’ll dare to init my img folder.

My major recommendation for using git annex is to have numcopies=1 or more and to sync often. That way it is pretty hard to lose data. I think I have like seven full copies of my repos :slight_smile:
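As a rough sketch of that advice, the routine could be wrapped in a small helper. The function name `annex_safe_sync` and the default of 2 copies are assumptions made for illustration, not something from the posts above; `git annex numcopies` and `git annex sync --content` are the real commands.

```shell
# Hypothetical helper: require a minimum number of copies, then sync.
# Assumes it is run inside an existing git-annex repository.
annex_safe_sync() {
    annex_copies="${1:-2}"                            # assumed default; pick what suits you
    git annex numcopies "$annex_copies" || return 1   # drops must leave this many copies
    git annex sync --content                          # sync git metadata and file contents
}
```

Called as `annex_safe_sync 3`, it would first raise the required number of copies to 3 and then sync.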

@chris can you describe what you’re trying to do with find? Have you tried whereis?

With

git annex find --want-drop --in .

and

git annex find --want-get --not --in .

I try to see which files will be transferred or dropped, to check whether a certain rule for preferred or required content works as intended. Unfortunately, when I then run the sync command, the behaviour is usually different from what I expected.
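For anyone wanting to repeat that check, here is a minimal sketch that bundles the two find commands with a look at the preferred content expression itself. The function name `annex_preview` is made up for this example, and it assumes you run it inside a git-annex repository.

```shell
# Hypothetical helper: list what a content sync would be expected to change here.
annex_preview() {
    echo "== preferred content expression for this repo =="
    git annex wanted here
    echo "== present here, but wanted to be dropped =="
    git annex find --want-drop --in .
    echo "== missing here, but wanted =="
    git annex find --want-get --not --in .
}
```

One possible reason for a mismatch (a guess, not a diagnosis): drops are still subject to the numcopies check, so a file listed under want-drop may be kept if not enough other copies can be verified.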

Sorry for kicking an old topic, but this seems like a good place to ask:

Today I played a bit with git-annex in a separate temp folder and it looks like a promising way of managing my raws (and version the xmps while I’m at it). One problem I have is that I have my photo collection on an NTFS drive so I can also access it on Windows, but git-annex insists on having two copies of each file, in the annex and the working directory.

This effectively halves my disk space. What am I doing wrong? Or is NTFS just not supported properly?

NTFS isn’t well supported since it doesn’t support Unix symlinks. Usually, when you issue the command git annex add <some file>, the hash of the file is computed, the file is moved into the annex in .git/<hash path>/<git address>, and a symlink pointing to the file in the annex is placed in the working directory.
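The hash, move and symlink steps can be imitated with plain shell tools. This is only a toy model: the `store` directory, the file name and the use of SHA256 as the key are assumptions for illustration, and git-annex’s real layout under .git is more involved.

```shell
# Toy model of "hash, move into a content store, leave a symlink behind".
demo="$(mktemp -d)"
cd "$demo" || exit 1
mkdir store
printf 'raw image bytes\n' > photo.raw

hash="$(sha256sum photo.raw | cut -d' ' -f1)"   # content hash becomes the key
mv photo.raw "store/$hash"                      # content now lives in the store
chmod a-w "store/$hash"                         # protect it from accidental edits
ln -s "store/$hash" photo.raw                   # working tree keeps only a symlink

ls -l photo.raw      # shows photo.raw -> store/<hash>
cat photo.raw        # content is still readable through the link
```

On a filesystem without symlink support the `ln -s` step is the one that cannot be done, which is why a second copy ends up in the working directory instead.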

I’ve not really been following git annex but I thought there were changes that worked around the issue. V7 or whatnot? I might be mixing things up.

Dang it, NTFS doesn’t do symlinks and worse, doesn’t play nicely with read/write permissions. That’s annoying, because it is the file system that is most compatible with other systems and OSs. Or are there alternative FSs for usb-drives that I could be looking at?

git config annex.thin true

The above has safety implications; check the documentation to see if it’s worth it for you.
https://git-annex.branchable.com/tips/unlocked_files/
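To see why this saves space and why it is risky: as far as I understand the linked page, annex.thin makes git-annex hard link the unlocked file to the annexed content where the filesystem allows it, instead of keeping a second copy. The toy example below shows only the hard link behaviour itself; the file names are made up, and it assumes GNU coreutils and a filesystem that supports hard links.

```shell
# Toy illustration of why a hard link avoids the second copy, and why it is risky.
thin="$(mktemp -d)"
printf 'original\n' > "$thin/annex_copy"
ln "$thin/annex_copy" "$thin/work_copy"     # second name, same data on disk

links="$(stat -c %h "$thin/annex_copy")"    # number of names pointing at the data
echo "link count: $links"

# The safety implication: writing through one name changes the other too.
printf 'edited\n' > "$thin/work_copy"
cat "$thin/annex_copy"
```

So if you edit a file in the working directory, the old version in the annex is gone as well. For raw files that are never modified, that trade-off may be acceptable.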


That’s actually quite a good idea! I’m not modifying the raw files anyway. Thanks!

Here is a nice video from SCALE about this specific workflow:

Now, somebody asks a question at 20:32 about some similar thing to git-annex called Adica, Attica or something similar.
It seems to be a Rust app or lib so can someone point me to it? I’d like to take a look at that app since I don’t know Haskell.

EDIT: Oh wow, I’ve just realized that this is you @paperdigits :smiley: Very cool talk man! :smiley: :smiley:

Hey thanks, that is me.

I think the Rust project you are asking about is rdrsss/attic-redux (“Attic...Again!”) on GitHub, and I actually went to that talk… It’s not production ready and a little bit more of a “cloudscale storage solution” than is necessary for my photo collection. That’s not a knock on that project, just that my photos aren’t web scale in size.

Oh well, the project seems to be dead anyway xD Thanks for the info :slight_smile:

Now I’m back to git-annex vs git-lfs again.

Git LFS is not the answer to this particular problem; it doesn’t store the files in git at all…

So far as I’m aware, git LFS needs a server to synchronize files.

The thing that I’m worried about is git-annex has 37805 commits from 57 contributors, 3 watchers, 27 stars and 0 forks while git-lfs has 7729 commits from 161 contributors, 424 watchers, 8400 stars and 1600 forks.

It seems to me that investing time in developing some “solution” around git-annex would be a temporary one because, although it’s very active now, git-annex will probably be dead in a few years.

This is of course if one was to develop a user friendly abstraction for managing photos via git-annex and possibly integrating it with the raw editor of choice.

And yeah, git-annex seems perfect for the job, but I don’t envision anyone willing to take on the maintenance of it in a few years, just because it’s not supported by the big git companies like GitLab, GitHub or Atlassian. It seems to be a crowdfunded thing with a few foundations supporting it on and off, and it rests largely on the shoulders of a single developer. That said, it’s such a cool piece of software.

Now, this is my first time thinking about this so I might be very very wrong.
What are your thoughts?

edit: It seems I’ve been looking at someone’s mirror of git-annex on GitHub :see_no_evil: :see_no_evil: :see_no_evil:, they are actually self hosted. So the stats I mentioned in the beginning are invalid but still…

I think that “social” stats for code are good for telling whether a project is absolutely dead in the water or not, but otherwise they fall to the lowest common denominator of group-think and cool-kid mentality that all other social media falls victim to.

Is the person with the most followers/likes/comments on instagram the best photographer? Certainly not.

Also keep in mind that the primary development repo for git annex is not on github, but github is the primary place for git-lfs.

When I look at git-annex, I see software that doesn’t rush its features and has been maintained for a long time, with a good track record for code quality, community, and transparency. It is quite a bit older than git-lfs, which was just a spec with no implementation at the time I started to use git-annex. With git-annex I see someone who wrote software to solve his own need, shared it, and found a funding model that keeps development moving and doesn’t depend on Silicon Valley VC money (which both GitHub and GitLab are beholden to). The community isn’t huge but is strong. It gains features as fast as it needs them, and doesn’t have to wait for a spec or for large corporations to approve employee time to do things.

Looking at the future is always a good idea when considering a long term solution, and I personally haven’t seen anything in git-annex that’d make me think it will be abandoned by its primary author any time soon. For my needs it is feature complete, but I prefer the command line and to know all the commands for my software, rather than running automated processes.

You can take that with a grain of salt, as I’m clearly biased towards git annex.

What do you see that makes you think git-annex will be dead in a few years?

What solution do you think needs to be put on top of it to make it applicable for your use?


Yup, I actually completely agree with you when you put it that way. I’ve just been reading a ton of articles and comments in favor of git-lfs over git-annex, and even about GitLab removing its support for annex. I can’t quite wrap my head around why everybody is jumping to lfs when it seems to be a really inferior solution to annex.
Lfs seems to only be good for enterprise use cases. Anything that would make life easier for a small dev, or make its use cases more flexible, is missing and not even on the road map.
Like… what’s the catch?

But yeah, I’m exploring git-annex right now and it’s a really really cool tool!

Not just my use, but I think it would be nice to have a better web interface for the remote host targeted specifically at photographers and a client integration with the raw editor. So you can see all local and remote photos right from Darktable for example and do pushing and pulling.

I’m a web dev so I was thinking about challenging myself to develop something that would move us in that direction. Start the ball rolling so to speak :slight_smile:


The catch, as I understand it (and I may be wrong now because I haven’t looked at LFS in a while), is that you need a hosted git solution to use LFS, so you need GitLab or GitHub or Gitea. So far as I’m aware, if I had a git lfs repo on my filesystem and cloned that repo to another place, I wouldn’t be able to pull the LFS stuff over to that local clone. With git annex, you can.

Also if you need good, barebones git annex hosting, you can use SSH or the hosting software gitolite.
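A minimal sketch of that clone-and-fetch flow with git annex, for either a path on the local filesystem or an SSH remote. The function name `annex_clone_with_content` and any paths you pass to it are hypothetical; the git and git annex subcommands are the standard ones.

```shell
# Hypothetical helper: clone an annex repo and fetch its file contents.
# "$1" is the source (a local path or user@host:path), "$2" the new directory.
annex_clone_with_content() {
    git clone "$1" "$2" || return 1
    git -C "$2" annex init || return 1     # set up git-annex in the new clone
    git -C "$2" annex get .                # copy file contents from the origin remote
}
```

For example, `annex_clone_with_content /mnt/backup/photos ~/photos` (made-up paths) would clone from another directory on the same machine and then pull the annexed files over, with no hosting service involved.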
