git lfs migrate is NOT doing what I expected.

I understood that I would be rewriting the history so that it would be as if the old commits were LFS commits.

But instead, I now have EXACTLY TWICE as many commits as before (?), so it's as if it left all the old commits in and added a new one for each.

There's more data in LFS, but the same amount in the regular object store.

And that's after running the cleanup/garbage collection the tutorial asks for.

This is very frustrating.
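A quick check for where extra commits might come from, sketched under the assumption that the doubled figure was counted across all refs: compare the history reachable from HEAD with the history reachable from every ref. Any branch, tag, or remote-tracking ref that still points at the pre-migration history keeps the old commits counted next to the rewritten ones. `count_commits` is a made-up helper name; `git rev-list --count` is the real command underneath.

```shell
# count_commits is a hypothetical helper, not part of Git or Git LFS.
count_commits() {
    repo="${1:-.}"
    # Commits reachable from the current branch only
    echo "head=$(git -C "$repo" rev-list --count HEAD)"
    # Commits reachable from any ref (branches, tags, remote-tracking refs)
    echo "all_refs=$(git -C "$repo" rev-list --count --all)"
}
```

If `all_refs` comes out at roughly twice `head`, `git for-each-ref` lists the refs that may still point at the old history. As far as I know, `git lfs migrate import` rewrites only the current branch unless told otherwise (for example with `--everything`), but that is worth checking against its documentation.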


Hmm. Perhaps it's the reference to the origin remote that is causing trouble.

I'm trying it again, after deleting the remote first, since the goal here is going to be to replace the remote, anyway.


Nope, that wasn't it.

Still seeing the same behavior.

I can't find anything that definitely tells me this is wrong behavior, but it can't possibly be right, as it makes the problem LFS is supposed to solve actually get worse.

My opinion of Git is not improving!

I may have found a Stack Overflow question from somebody encountering the same problem. Sadly, all the answers are clueless responses from people who didn't read carefully and keep making claims already proved false:

stackoverflow.com/questions/51

Specifically, in that case, as in mine, the original repo contents stored in ".git/objects" have not decreased, or decreased by only a few percent, even after garbage collection is run.

But the LFS data in ".git/lfs" HAS increased. So the whole repo has about doubled in size.

When this is pushed up to the server (Gitea, in my case), the LFS data will go into some other storage, which is great, but the original repo is just as large as before.
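For anyone comparing numbers, here is a small sketch of how the two stores can be measured separately, before and after a migration. `repo_sizes` is a made-up helper name; it relies only on `du` and the standard `.git/objects` and `.git/lfs` locations mentioned above.

```shell
# repo_sizes is a hypothetical helper, not part of Git or Git LFS.
repo_sizes() {
    repo="${1:-.}"
    # Regular Git objects (loose objects and packfiles), in KiB
    du -sk "$repo/.git/objects" | awk '{print "objects_kib=" $1}'
    # LFS objects; the directory may not exist until LFS has stored something
    if [ -d "$repo/.git/lfs" ]; then
        du -sk "$repo/.git/lfs" | awk '{print "lfs_kib=" $1}'
    else
        echo "lfs_kib=0"
    fi
}
```

Running `repo_sizes MyRepo` once before and once after the migration gives the two pairs of numbers to compare. `git -C MyRepo count-objects -vH` reports the packed size from Git's own point of view.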

Okay. So I've changed a few things to try to follow the examples more closely, and I think I've got it working better.

I had previously just copied my repo to a test directory to work on it, but this time, I used Git to clone it -- so I guess that's not just a simple copy.

I also deleted all of the working files from the directory, so that only the repo itself remained.

Also, this time, I verified that the lfs directory was empty.

[...]

This at least gave me some actual results: most of the filetypes I had specified were converted to LFS storage.

One annoying thing is that the storage still grew a lot, and the growth remained after running the cleanup and garbage collection.

But the regular repo is now much smaller: 1.1GB. The LFS storage, however, is 25GB, compared to 10GB for the same data in the regular repo.

So I guess that implies the LFS storage is a little less efficient?

[...]

Actually about 2.5X less efficient.

This is probably still a win, because the LFS data will be stored in S3 object storage, which is a lot cheaper than local storage on the server.

But it's interesting. I wonder why that happens?
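One plausible reason, offered as a guess rather than a diagnosis of this repo: Git's packfiles zlib-compress objects and store similar versions of a file as deltas, whereas Git LFS keeps every version of every file as a separate, complete file under ".git/lfs/objects". The sketch below uses plain Git only (no LFS) on a throwaway repo to show how much smaller the packed form can be than the sum of the whole versions. The file name, the contents, and the `g` wrapper are all made up for the demonstration.

```shell
# Throwaway repo; nothing here touches a real project.
demo=$(mktemp -d)
git init -q "$demo"
# g is a hypothetical wrapper so the commits work without a global identity
g() { git -C "$demo" -c user.name=demo -c user.email=demo@example.com "$@"; }

seq 1 200000 > "$demo/data.txt"        # first version, roughly 1.3 MB of text
g add data.txt
g commit -q -m "v1"
echo "one more line" >> "$demo/data.txt"
g commit -q -a -m "v2"                 # second version differs by one line
g gc -q --aggressive --prune=now

# Both versions stored whole, as a store without compression or deltas would hold them
raw_kib=$(( ( $(g cat-file -s HEAD:data.txt) + $(g cat-file -s HEAD~1:data.txt) ) / 1024 ))
# What Git's packfiles occupy after compression and delta encoding
pack_kib=$(g count-objects -v | awk '/^size-pack:/ {print $2}')
echo "raw: ${raw_kib} KiB, packed: ${pack_kib} KiB"
```

Text like this compresses far better than most binary assets, so the ratio on a real repo will differ. The direction of the effect is the point.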

@TerryHancock question: did you prune the reflog?

I assume you did a 'git gc' afterwards to clean up its objects. But that only cleans up objects that are not reachable from any reference. The reflog is a list of every position HEAD has been in, and it keeps those commits referenced. You can check the docs for how to expire it pretty easily, then GC again.

@tek_dmn

No, I did those both. It's in the documentation.

I think the real problem was that this was not a clone of the original repo, but a copy. I'm not sure why that would matter, though, as the original is a clone.

I eventually got a better result.

Weirdly, the LFS storage is about 2.5X larger than the files in the repo that it replaced, though, which is a little odd. I'm assuming that's a difference in how the data is packed.

But I'm putting that into cheaper storage.

@tek_dmn

The commands suggested for cleanup are:

$ git reflog expire --expire-unreachable=now --all

$ git gc --prune=now --aggressive
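To confirm that those two commands actually left nothing behind, something like the following can be run before and after them. This is a sketch, not part of the tutorial: `leftovers` is a made-up helper name, while `git reflog show --all` and `git fsck --unreachable --no-reflogs` are standard Git commands.

```shell
# leftovers is a hypothetical helper, not part of Git or Git LFS.
leftovers() {
    repo="${1:-.}"
    # Reflog entries across all refs; these keep old commits referenced
    echo "reflog_entries=$(git -C "$repo" reflog show --all | wc -l | tr -d ' ')"
    # Objects that no ref can reach (reflogs deliberately ignored)
    echo "unreachable=$(git -C "$repo" fsck --unreachable --no-reflogs 2>/dev/null | grep -c '^unreachable')"
}
```

If `unreachable` is still above zero after the cleanup, old objects have survived the garbage collection. If it is zero and the repo is still large, the old history is most likely still reachable from some ref.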

@tek_dmn

Naturally, I'm doing all of this testing on a copy of the original repo before I attempt it on a real one.

But I originally just made a file-system-level copy:

$ cd Test
$ cp -arp ../MyRepo MyRepo

Later, I decided to use Git to do it, just in case, and that does seem to make a difference:

$ cd Test
$ git clone ../MyRepo MyRepo

I suspect this was the source of my problem, as well as the Stack Overflow case I found.

I also made sure to remove all remotes from the repo.
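One concrete difference between the two approaches, shown on a throwaway repo: a file-system copy carries the source's reflogs along, while a clone starts with a fresh one. Whether that explains the earlier results is an assumption. All paths and names below, including the `g` wrapper, are made up for the demonstration.

```shell
# Throwaway source repo with a little history, including an amended commit.
work=$(mktemp -d)
git init -q "$work/src"
# g is a hypothetical wrapper so the commits work without a global identity
g() { git -C "$work/src" -c user.name=demo -c user.email=demo@example.com "$@"; }
echo one > "$work/src/file.txt"
g add file.txt
g commit -q -m "first"
echo two > "$work/src/file.txt"
g commit -q -a -m "second"
g commit -q --amend -m "second, reworded"

cp -a "$work/src" "$work/copy"            # file-system copy: reflogs come along
git clone -q "$work/src" "$work/clone"    # clone: refs and history, fresh reflog

# Count HEAD reflog entries in each result
copy_entries=$(git -C "$work/copy" reflog show HEAD | wc -l | tr -d ' ')
clone_entries=$(git -C "$work/clone" reflog show HEAD | wc -l | tr -d ' ')
echo "copy: $copy_entries reflog entries, clone: $clone_entries"
```

A clone also sets up an 'origin' remote pointing back at the source, which is why removing the remotes afterwards is a separate step.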

@tek_dmn
I also had a notion that the working copy of the files might be creating an issue somehow, so I deleted them all, leaving only the .git repo directory.

And then I ran the migrate commands, and that seemed to give the expected results.

@TerryHancock I do believe a local clone of a git repo uses hardlinks, plus a clone operation does automatically set the 'origin' remote to wherever you cloned from

@TerryHancock Basically. Or, to be more aggressive, use --expire rather than --expire-unreachable.

The reflog is really an undo list for each ref, so if you make an errant change you can roll it back to a previous state. But wiping it out means those commits are no longer referenced.
