There’s a good deal of confusion about Git, and Ian’s “Use Mercurial You Git” article is a good example of it, so I’d like to address it point by point. But first, I’d like to say that I’m giving Ian the benefit of the doubt. I don’t think he’s intentionally trying to mislead people. I think he simply doesn’t understand Git very well, and that’s not his fault.
Git has two problems that face new users: it suffers from the Blub Paradox, and the documentation tends to assume that you already “get” how radically different Git is. So people end up applying their assumptions about how other version control systems work to Git, and end up confused and frustrated like Ian. While some of the commands look similar, what Git is doing and how it does it are dramatically different from other systems. But that’s a good thing. It’s what makes Git so much better than the competition. It’s kind of like how Struts (a Java web framework) and Ruby on Rails (a Ruby web framework) are both doing essentially the same thing and processing the same parameters from a web server, but if you try to program a Rails app the way you program a Struts app you’re in for a world of hurt.
The basic version control concepts are the same in Git, but at the same time, they’re not. Most version control systems are concerned with managing files, or metadata about files. Git doesn’t even care about files. Yes, it seems like it does, and you can even use it as if it does, but it really doesn’t. What really sold me on this statement was realizing that, if I wanted to, Git would let me pick and choose between the various hunks of changes in a file and only add the ones I wanted to the next commit.
My advice to new users is simple. To get started, act as if it’s like the version control system you’re familiar with, but as soon as something doesn’t work the way you’d expect, take the time to learn what is really happening and why Git is doing it. Every single time this has happened to me I have had another “aha!” moment which gave me a significantly better understanding of Git as a whole and showed me how truly awesome Git really is.
So, with that said, let’s look at Ian’s complaints:
When talking about Git’s usability, Ian complains that Git installs “nearly 150 distinct binaries” and claims that “Mercurial has one.” What Ian doesn’t mention is that while Git may install nearly 150 files, you only have to reference one of them directly, just like Mercurial, which installs over 150 files (a number which roughly doubles as each .py file begets a binary .pyc file after it gets executed).
But this is a straw-man argument, because it doesn’t matter how many files an application installs, and it doesn’t matter if they’re binary or not. A much more important measure is how many you have to interact with directly, and in both cases the answer is one.
Ian uses Git’s rebase command as an example of why Git’s usability is bad. Rebase is one of the most dramatically powerful tools in the Git arsenal. It’s also something you never have to use if you don’t want to. It allows you to reorder, merge, and exclude any or all of the commits in your history.
But guess what? Ian doesn’t mention the fact that Mercurial has similar functionality called Transplant, which is roughly as easy to get confused about as Git’s rebase. Of course you don’t ever have to use it in Mercurial either, as evidenced by his not knowing about it. I’m assuming he wasn’t intentionally trying to make Git look bad by ignoring Transplant. The fact that the Rubinius project he was looking at requests that committers use rebase for its valuable housekeeping aspects is irrelevant to the discussion of Git’s usability.
He then goes on to say that “For day-to-day use of Mercurial, you only need hg fetch to get code, and hg commit to give code.” Except commit only works on a local repo and doesn’t “give” code to anyone, and my fairly recent copy of Hg only has a “pull” command, although I do see mention on Google that fetch is “…only available when you enable hgext.fetch extension…”. So what he should have said (unless we want to include non-default commands) is that “for day-to-day use of Mercurial, you only need hg pull && hg up && hg merge && hg commit to get code and hg push to give code.” If we cut him some slack and replace all that garbage with hg fetch, and compare it to Git, where you need git pull to get code and git push to give code, you’ll see that the two are almost identical. In fact, for a surprising number of operations Git and Hg use exactly the same name for the same command.
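To make the Git side of that comparison concrete, here is a rough sketch of the get/give cycle. It assumes git is installed. The bare repository shared.git, the two clones, and the identities are hypothetical stand-ins for a real server and real developers.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare shared.git              # stands in for a shared remote repo

# Alice clones, commits locally, then "gives" her code with push.
git clone -q shared.git alice 2>/dev/null  # hide the "empty repository" warning
cd alice
git config user.name "Alice"               # made-up identity
git config user.email "alice@example.com"
echo "v1" > app.txt
git add app.txt
git commit -q -m "First version"           # local only, nothing is shared yet
git push -q origin HEAD

# Bob clones the shared repo and gets the first version.
cd "$work"
git clone -q shared.git bob

# Alice publishes another change.
cd "$work/alice"
echo "v2" > app.txt
git commit -q -a -m "Second version"
git push -q origin HEAD

# Bob "gets" it with a single pull.
cd "$work/bob"
git pull -q --ff-only
cat app.txt
```

The --ff-only flag is only there to keep the sketch quiet and predictable. Day to day, a plain git pull does the job.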
Ian complains about the fact that git checkout can do three “massively different tasks”: change to a new head, revert changes to a small number of files, and create a branch. And from the perspective of someone who doesn’t understand Git or version control systems I can see why they might look “massively different”. But let’s look at them from the perspective of someone who does understand Git and version control a little better.
First off, if you don’t know, “change to a new HEAD” is an odd way of saying switch to a different branch (HEAD is just a convenient pointer to the current branch). When you grab a branch from most version control systems’ repositories, the operation is referred to as “checking out” the code or branch. So when you replace the tree in your current directory with that of another branch, you are literally “checking out” the code from the repository, which is what every system does when you request the files from a branch. But if “checking out” a branch was confusing for him, he could simply keep each branch in a separate folder, like most revision control systems do, and just change directories.
To cut him some more slack, Git allows you to manage multiple branches within the same repo, which entails managing multiple branches in the same folder. And yes, the power to manage totally different branches in the same place does require some new learning, and has some potential gotchas if you don’t understand how it works, but note that I said “allows you to manage”, not “requires”. You can live in the simpler one-branch-per-root-folder world if you want, and there’s no real penalty for doing so. Actually, I encourage new users to work that way. It’s a simpler way to get started. And yes, to be clear, each folder would be a different repo, but Git’s smart and hardlinks the common objects (unless you tell it not to) to save space, and merging branches across repos is just a matter of git pull instead of git merge.
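The one-branch-per-folder workflow might look something like the sketch below. It assumes git is installed. The folder names stable and experiment, the file names, and the identity are invented for the example.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# First folder: the main line of development.
git init -q stable
cd stable
git config user.name "Demo User"         # made-up identity
git config user.email "demo@example.com"
echo "base" > app.txt
git add app.txt
git commit -q -m "Base"

# Second folder plays the role of a branch. It is a separate repo,
# created as a local clone of the first one.
cd "$work"
git clone -q stable experiment
cd experiment
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "idea" > idea.txt
git add idea.txt
git commit -q -m "Try an idea"

# "Merging the branch" is a pull from the other folder's path.
cd "$work/stable"
git pull -q --ff-only ../experiment
ls
```

Switching “branches” here is nothing more than cd, which is exactly the mental model most people bring with them from other systems.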
When you “revert changes to a small number of files” Git, like any other version control system, gets a copy from the repository, and what do you call the action of getting files from a repository? That’s right kids, “checking out”. The fact that Git calls it what it is instead of giving it some misleading name like CVS’s “update” command, which doesn’t “update” shit but instead checks out a version from the repo, should be a point in Git’s favor, not a point against it. Hg calls this operation “update” too but “checkout” is an alias for it*, so I guess they get to make it work the way the CVS / SVN people and Git people expect.
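On the Git side, that looks roughly like this. The sketch assumes git is installed, and the repository, the file notes.txt, and the identity are made up for the demonstration.

```shell
set -e
repo=$(mktemp -d)                      # throwaway demo repository (hypothetical)
cd "$repo"
git init -q
git config user.name "Demo User"       # made-up identity
git config user.email "demo@example.com"

echo "original" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Make a change we later regret.
echo "scratch edit" > notes.txt

# Check the file out again, replacing the working copy with the
# version recorded in the repository.
git checkout -- notes.txt
cat notes.txt
```

The “--” simply separates the file name from anything that could be mistaken for a branch name.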
His argument about checkout enabling you to create a branch is a fair one, but isn’t it nicer to be able to create the branch and check it out in one command instead of two? If Ian was really confused by this he could simply create it with “git branch <name>” and then check it out with “git checkout <name>”. Personally I prefer the combined form since I’m creating branches all the time.
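Both forms side by side, as a small sketch. It assumes git is installed, and the branch names feature-a and feature-b, the README file, and the identity are placeholders.

```shell
set -e
repo=$(mktemp -d)                      # throwaway demo repository (hypothetical)
cd "$repo"
git init -q
git config user.name "Demo User"       # made-up identity
git config user.email "demo@example.com"
echo "hello" > README
git add README
git commit -q -m "Initial commit"

# Two-step form: create the branch, then check it out.
git branch feature-a
git checkout -q feature-a

# Combined form: create and check out in one command.
git checkout -q -b feature-b

git branch                             # lists all branches, current one marked with *
```

Either way you end up in the same place, so pick whichever reads better to you.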
In writing this book on Git I have repeatedly come across situations where a command took options that made it seem to do “massively different tasks”, but when I spent the time to investigate what it was really doing I almost always found that the problem was simply that I had been applying behavioral expectations from other version control systems to Git without understanding the fundamentals of how things work in Git. Now, helping people to understand its fundamentals is something that Git’s docs are not particularly good at. They tend to tell you how to execute a command and assume you understand what’s happening (at a conceptual level) behind the scenes. This is one of the things I’m trying to address with the book. “git add” is a great example of this. If you don’t really understand how Git leverages its index you’ll be bewildered by all those “massively different tasks” you can do with it, but once you understand the index they all make sense and you realize that “add” is exactly where they should be.
He says that Git has lost the “simplicity argument”. But the basics of Git are, in fact, very simple to use. If you wanted to you could use it in almost exactly the same way you use CVS with commands of essentially similar complexity, and CVS is a very simple system. But, Git also offers you the power to do very complex things. And that’s exactly how I wish every piece of software I used was.
Let me give you an example of how simple things can be:
|task||CVS||Git||Mercurial|
|checkout a repo||cvs co
|add a file||cvs add
|replace a file with the version last checked in||cvs update
|commit changes to the repo||cvs commit||git commit||hg commit|
I could continue but you get the point. For all your basic commands these systems are all equally simple. Yes, there are complex commands too, but that’s true of all version control systems.
Ian quotes a bit of Git knowledge:
The core Git filesystem can be explained as four types of objects: Blobs (files), Trees (directories), Commits and Tags.
And then says “Unfortunately, no, it can’t. The core of Git may well be implemented as four kinds of things. But to get even the most basic tasks done, you need to know repositories, working trees, branches, remotes, masters, origins, index caches, and a bunch of other unexplained concepts.”
With the exception of the index (or “index cache” as he calls it), all of these things exist in Mercurial, and you need to understand the same ones in both. Maybe Mercurial does have something similar to the index for staging commits; I don’t know.
You could argue that “masters” don’t exist in Hg but master is just the default name for the initial branch in a repo. They had to call it something and they picked “master”. Every version control system has at least one branch whether or not they happen to mention it, and if you only use one branch per repo you never have to even think about “master”. If you don’t want to name it “master” you don’t have to.
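If the name really bothers you, renaming the initial branch is a one-liner. This sketch assumes git is installed. The new name trunk, the README file, and the identity are arbitrary choices for the example.

```shell
set -e
repo=$(mktemp -d)                      # throwaway demo repository (hypothetical)
cd "$repo"
git init -q
git config user.name "Demo User"       # made-up identity
git config user.email "demo@example.com"
echo "hello" > README
git add README
git commit -q -m "Initial commit"

# Rename the current (initial) branch to whatever you like.
git branch -m trunk

git branch                             # shows: * trunk
```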
SVN has the idea of remote repositories just like Hg and Git, and you don’t need to use remote repositories in any of them. It’s certainly helpful if you understand the concept, but it’s not as if this is some obscure thing that Git came up with and no one else has. To do basic CVS-style operations you can be blissfully ignorant of the index in Git. All you need to know is to use the “-a” option when you commit to commit all your changes.
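Here is what ignoring the index looks like in practice, as a minimal sketch. It assumes git is installed, and the file notes.txt and the identity are made up. One caveat worth knowing: “-a” only picks up changes to files Git already tracks, so a brand new file still needs one “git add”.

```shell
set -e
repo=$(mktemp -d)                      # throwaway demo repository (hypothetical)
cd "$repo"
git init -q
git config user.name "Demo User"       # made-up identity
git config user.email "demo@example.com"

# A new file has to be added once so Git starts tracking it.
echo "line one" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# From here on, edit and commit with -a, CVS style, no staging step.
echo "line two" >> notes.txt
git commit -q -a -m "Extend notes"

git status --short                     # prints nothing: the working tree is clean
git log --oneline
```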
He’s totally right about not being able to look up the appropriate command in Git’s man pages without knowing what it’s called. It is sometimes hard to find instructions for what you’re trying to do in Git. This is another thing I’m trying to address in the book, but that’s no help to people who don’t have the book, and it’s no excuse for the docs not handling it better. Since the book will be GPL, though, I fully expect some of what I write to make it onto the web (I may even put it there myself), and maybe into the docs.
He’s also right about Git sucking on Windows, although he doesn’t put it that way, and this is the biggest reason not to use Git. If you need to share your codebase with Windows users Git is, IMNSHO, simply not ready. It can be done with Cygwin but Mercurial is much less of a hassle on Windows. There is work being done to address this, but for now Git is primarily for Linux, Unix, and OS X.
[Update] Since this was written the Git tooling has improved dramatically on Windows and there are a number of different, and good, ways to interact with it.
He also complains about Git being unreliable but doesn’t back it up in this post. I’ve seen no evidence to this effect or mention of any on the web, but I’m betting that his problems with Git’s “reliability” will be based in similar misunderstandings of how Git works. That would mean Git needs to work on its documentation to improve people’s understanding.
Ian makes a lot of complaints about how much worse Git is than Hg, but in almost all cases, if you examine the truth behind what he says, you end up with Git and Hg coming out almost exactly the same. The places where Mercurial wins are Windows support and, probably, documentation. Both of these are very important, but neither of them makes the tool more functional. For everyday simple usage Hg and Git are roughly equivalent tools from a usage standpoint (assuming you understand how to use them), but I believe that Git is simply a more powerful tool that lets you go much farther beyond the “everyday simple usage”. Yes, it requires thinking differently about version control. Is that really so bad? Most of the programming tools that can dramatically enhance your productivity or capabilities require getting your head around new concepts. Git really isn’t hard, it’s just different. Think of it like a harp. It’s simple to understand how to play, and it looks like plucking it would be the same as plucking any other stringed instrument, but it really isn’t, and you’re going to have to get used to it because your fingers interact with it like nothing else. And you’re not going to be able to take it everywhere either.
On a related note: Ian’s gripes are primarily founded in issues about usability and the failure of Git’s documentation to help him understand it, but based on those criteria Darcs kicks everyone’s ass. I don’t recommend Darcs anymore for a number of reasons, but, with a couple of exceptions, Darcs is damn easy to use and it’s not a bad system at all.
P.S. Thanks for the correction on Mercurial, Lurker, and thanks, SJS, for your notes about CVS.