A rebuttal to “Use Mercurial You Git”
February 7th, 2008 by masukomi

There’s a good deal of confusion about Git, and Ian’s “Use Mercurial
You Git” article is a good example of it, which I’d like to address
point by point. But first, I’d like to say that I’m giving Ian the
benefit of the doubt. I don’t think he’s intentionally trying to
mislead people. I think he simply doesn’t understand Git very well, and
that’s not his fault. Git has two problems that face new users: it
suffers from the Blub Paradox, and the documentation tends to assume
that you already “get” how radically different Git is. So people end up
applying their assumptions about how other version control systems work
to Git, and end up confused and frustrated like Ian. While some of the
commands look similar, what Git is doing and how it does it are
dramatically different from other systems. That’s a good thing. It’s
what makes Git so much better than the competition. It’s kind of like
how Struts (Java web framework) and Ruby on Rails (Ruby web framework)
are both doing essentially the same thing and processing the same
parameters from a web server, but if you try to program a Rails app
like you program a Struts app you’re in for a world of hurt.

The basic version control concepts are the same in Git, but at the same
time, they’re not. Most version control systems are concerned with
managing files, or metadata about files. Git doesn’t even care about
files. Yes, it seems like it does, and you can even use it as if it
does, but it really doesn’t. The key for me to really buying this
statement was when I realized that, if I really wanted to, Git would
let me pick and choose between the various hunks of a file and only add
the ones I wanted to the next commit.

My advice to new users is simple. To get started, act as if it’s like
the version control system you’re familiar with, but as soon as
something doesn’t work the way you’d expect, take the time to learn
what is really happening and why Git is doing it. Every single time
this has happened for me I have obtained another “aha!” moment which
gave me a significantly better understanding of Git as a whole and
showed me how truly awesome Git really is.

So, with that said, let’s look at Ian’s complaints:

Installed Files
When talking about Git’s usability Ian complains that Git installs
“nearly 150 distinct binaries” and claims that “Mercurial has one.”
What Ian doesn’t mention is that while Git may install nearly 150
files, you only have to reference one of them directly, just like
Mercurial, which installs over 150 files (a number which roughly
doubles as each .py file begets a binary .pyc file after it gets
executed). But this is a straw-man argument, because it doesn’t matter
how many files an application installs, and it doesn’t matter if
they’re binary or not. A much more important measure is how many you
have to interact with directly, and in both cases the answer is one.
The confusion is probably a result of the fact that Git actually allows
you to call its component parts directly if you really want to, whereas
Hg does not. Another good example of this is Dreamweaver, which
installs a metric fuckload of files, including a little JavaScript file
for almost every one of the cool pieces of functionality. But you never
hear newbish web developers complaining about it because they never
have to interact with them directly. However, power users love the fact
that they can work with them directly (and change how Dreamweaver
works).

Usability
Ian uses Git’s rebase command as an example of why Git’s usability is
bad. Rebase is one of the most dramatically powerful tools in the Git
arsenal. It’s also something you never have to use if you don’t want
to. It allows you to reorder, merge, and exclude any or all of the
commits in your history. But guess what? Ian doesn’t mention the fact
that Mercurial has similar functionality called Transplant, which is
roughly as easy to get confused about as Git’s rebase. Of course you
don’t ever have to use it in Mercurial either, as evidenced by his not
knowing about it. I’m assuming he wasn’t intentionally trying to make
Git look bad by ignoring Transplant. The fact that the Rubinius project
he was looking at requests that committers use rebase for its valuable
housekeeping aspects is irrelevant to the discussion of Git’s
usability.

He then goes on to say that “For day-to-day use of Mercurial, you only
need hg fetch to get code, and hg commit to give code.” Except commit
only works on a local repo and doesn’t “give” code to anyone, and my
fairly recent copy of Hg only has a “pull” command, though I do see
mention in Google that fetch is “…only available when you enable
hgext.fetch extension…”. So, what he should have said (unless we want
to include non-default commands) is that “for day-to-day use of
Mercurial, you only need hg pull && hg up && hg merge && hg commit to
get code and hg push to give code.” If we cut him some slack and
replace all that garbage with hg fetch, and compare it to Git, where
you need git pull to get code and git push to give code, you’ll see
that the two are almost identical. In fact, for a surprising number of
operations Git and Hg use exactly the same name for the same command.
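
To make the Git half of that comparison concrete, here is a small
sketch of the pull/push round trip. Everything in it is invented for
illustration: a local bare repository stands in for the shared server,
and “alice” and “bob” are hypothetical users with placeholder
identities.

```shell
set -e
work=$(mktemp -d)

# A bare repository stands in for the shared server.
git init -q --bare "$work/shared.git"

# Alice clones it, commits, and "gives" her code with git push.
git clone -q "$work/shared.git" "$work/alice" 2> /dev/null  # hide the empty-repo warning
cd "$work/alice"
git config user.name "Alice Example"
git config user.email "alice@example.com"
echo "first note" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git push -q origin HEAD

# Bob clones the shared repository after Alice's first push.
git clone -q "$work/shared.git" "$work/bob"

# Alice pushes a second change...
echo "second note" >> notes.txt
git commit -q -a -m "Add a second note"
git push -q origin HEAD

# ...and Bob "gets" it with git pull.
cd "$work/bob"
git pull -q
cat notes.txt
```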

Overloaded Commands
Ian complains about the fact that git checkout can do three
“massively different tasks”: change to a new head, revert changes to a
small number of files, and create a branch. And from the perspective of
someone who doesn’t understand Git or version control systems I can see
why they might look “massively different”. But let’s look at them from
the perspective of someone who does understand Git and version control
a little better.

First off, if you don’t know, “change to a new HEAD” is an odd way of
saying switch to a different branch (HEAD is just a convenient pointer
to the current branch). When you grab a branch from most version
control systems’ repositories the operation is referred to as “checking
out” the code/branch, so when you replace the tree in your current
directory with that of another branch you are literally “checking out”
the code from the repository, which is what every system does when you
request the files from a branch. But if “checking out” a branch was
confusing for him, he could simply keep each branch in a separate
folder like most revision control systems do and just change
directories.

To cut him some more slack, Git allows you to manage multiple branches
within the same repo, which entails managing multiple branches in the
same folder. And yes, the power to manage totally different branches in
the same place does require some new learning, and has some potential
gotchas if you don’t understand how it works, but note that I said
“allows you to manage”, not requires. You can live in the simpler one
branch per root folder world if you want, and there’s no real penalty
to doing so. Actually, I encourage new users to work that way. It’s a
simpler way to get started. And yes, to be clear, each folder would be
a different repo, but Git’s smart and hardlinks the common objects
(unless you tell it not to) to save space, and merging branches in
across repos is just a matter of git pull instead of git merge.
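
The one-branch-per-folder workflow might look something like the sketch
below. The folder names “stable” and “experiment”, the file, and the
identity are all made up for the example.

```shell
set -e
base=$(mktemp -d)

# The "stable" folder holds the original line of development.
git init -q "$base/stable"
cd "$base/stable"
git config user.name "Example User"
git config user.email "user@example.com"
echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial version"
branch=$(git symbolic-ref --short HEAD)   # whatever the default branch is called

# A second folder plays the role of a branch. It is a full repo of its own.
git clone -q "$base/stable" "$base/experiment"
cd "$base/experiment"
git config user.name "Example User"
git config user.email "user@example.com"
echo "v2" > app.txt
git commit -q -a -m "Try an experiment"

# Bring the experiment back by pulling from the other folder.
cd "$base/stable"
git pull -q "$base/experiment" "$branch"
cat app.txt
```

Switching “branches” here is nothing more than changing directories.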

When you “revert changes to a small number of files” Git, like any
other version control system, gets a copy from the repository, and what
do you call the action of getting files from a repository? That’s right
kids, “checking out”. The fact that Git calls it what it is instead of
giving it some misleading name like CVS’s “update” command, which
doesn’t “update” shit but instead checks out a version from the repo,
should be a point in Git’s favor, not a point against it. Hg calls this
operation “update” too, but “checkout” is an alias for it*, so I guess
they get to make it work the way both the CVS / SVN people and the Git
people expect.
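
For the simple case, reverting a file with checkout can be sketched
like this. The file name and contents are hypothetical, and the example
assumes the unwanted edit has not been staged with git add.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "original" > config.txt
git add config.txt
git commit -q -m "Add config"

echo "oops" > config.txt      # an unwanted edit, not staged
git checkout -- config.txt    # put the checked-in version back
cat config.txt
```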

His argument about checkout enabling you to create a branch is a fair
one, but isn’t it nicer to be able to create the branch and check it
out in one command instead of two? If Ian was really confused by this
he could simply create it with “git branch” and then check it out with
“git checkout”. Personally I prefer the combined form since I’m
creating them all the time.
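
Side by side, the two-step and combined forms look roughly like this.
The branch names topic-a and topic-b are placeholders chosen for the
example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > readme.txt
git add readme.txt
git commit -q -m "Initial commit"

# Two-step form: create the branch, then check it out.
git branch topic-a
git checkout -q topic-a

# Combined form: create and check out in one command.
git checkout -q -b topic-b

git branch   # lists the branches; the current one is marked with *
```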

In writing this book on Git I have repeatedly come across situations
where a command took options that made it seem to do “massively
different tasks”, but when I spent the time to investigate what it was
really doing I almost always found that the problem was simply that I
had been applying behavioral expectations from other version control
systems to Git without understanding the fundamentals of how things
work in Git. Now, helping people to understand its fundamentals is
something that Git’s docs are not particularly good at. They tend to
tell you how to execute a command and assume you understand what’s
happening (at a conceptual level) behind the scenes. This is one of the
things I’m trying to address with the book. “git add” is a great
example of this. If you don’t really understand how Git leverages its
index you’ll be bewildered as to all those “massively different tasks”
you can do with it, but once you understand the index they all make
sense and you realize that “add” is exactly where they should be.

Simplicity
He says that Git has lost the “simplicity argument”. But the
basics of Git are, in fact, very simple to use. If you wanted to
you could use it in almost exactly the same way you use CVS with
commands of essentially similar complexity, and CVS is a
very simple system. But, Git also offers you the power to do
very complex things. And that’s exactly how I wish every piece of
software I used was.

Let me give you an example of how simple things can be:

Operation                                        CVS         Git           Hg
checkout a repo                                  cvs co      git clone     hg clone
add a file                                       cvs add     git add       hg add
replace a file with the version last checked in  cvs update  git checkout  hg revert
commit changes to the repo                       cvs commit  git commit    hg commit

I could continue but you get the point. For all your basic commands
these systems are all equally simple. Yes there are complex commands
too, but there are in all version control systems.

Ian quotes a bit of Git knowledge:

The core Git filesystem can be explained as four
types of objects: Blobs (files), Trees (directories), Commits and Tags.

And then says “Unfortunately, no, it can’t. The core of Git
may well be implemented
as four kinds of things. But to get even the most basic tasks
done, you need to know repositories, working trees, branches, remotes,
masters, origins, index caches, and a bunch of other unexplained
concepts.”

With the exception of the index (or “index cache” as he calls it) all
of these things exist in Mercurial and you also need to understand the
same ones in both. Although maybe Mercurial does have something similar
to the index for staging commits, I don’t know. You could argue that
“masters” don’t exist in Hg but master is just the default name for the
initial branch in a repo. They had to call it something and they picked
“master”. Every version control system has at least one branch whether
or not they happen to mention it, and if you only use one branch per
repo you never have to even think about “master”. And if you don’t want
to name it “master” you don’t have to. SVN has the idea of remote
repositories just like Hg and Git and you don’t need to use
remote repositories in any
of them. It’s certainly helpful if you understand the concept but it’s
not as if this is some obscure thing that Git came up with and no-one
else has. To do basic CVS style operations you can be
blissfully ignorant of the index in Git. All you need to know is to use
the “-a” option when you commit to commit all your changes.
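
A CVS-style edit-and-commit cycle that never touches the index
explicitly might look like the following sketch. The file name and
commit messages are invented, and note that a brand new file still has
to be added once before -a will pick it up.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "draft" > todo.txt
git add todo.txt                         # a new file still has to be added once
git commit -q -m "Add todo list"

echo "buy milk" >> todo.txt              # edit a tracked file
git commit -q -a -m "Update todo list"   # -a commits all changes to tracked files

git status --short                       # prints nothing: the working tree is clean
git log --oneline
```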

Finding Documentation
He’s totally right about not being able to look up the appropriate
command in Git’s man pages without knowing what it’s called. It is
sometimes hard to find instructions for what you’re trying to do in
Git. This is another thing I’m trying to address in the book, but
that’s no help to people who don’t have the book, and no excuse for it
not being better handled by the docs. Since the book will be GPL, I
fully expect some of what I write to make it onto the web (I may even
put it there myself), and maybe into the docs.

Windows
He’s also right about Git sucking on Windows, although he doesn’t put
it that way, and this is the biggest reason not to use Git. If
you need to share your codebase with Windows users Git is, IMNSHO,
simply not ready. It can be done with Cygwin but Mercurial is much less of a hassle
on Windows. There is work being done to address this, but for now Git
is primarily for Linux, Unix, and OS X.

Unreliability
He also complains about Git being unreliable but doesn’t back it up in
this post. I’ve seen no evidence to this effect or mention of any on
the web, but I’m betting that his problems with Git’s “reliability”
will be based in similar misunderstandings of how Git works, which
would mean that Git needs to work on its documentation to help
improve people’s understanding.

Closing Thoughts
Ian makes a lot of complaints about how much worse Git is than Hg, but
in almost all cases if you examine the truth behind what he says you
end up with Git and Hg coming out almost exactly the same. The places
where Mercurial wins are Windows support and, probably, documentation.
Both of these are very important, but neither of them makes the tool
more functional. For everyday simple usage Hg and Git are roughly
equivalent tools from a usage standpoint (assuming you understand how
to use them), but I believe that Git is simply a more powerful tool
that lets you go much farther beyond the “everyday simple usage”. Yes,
it requires thinking differently about version control. Is that really
so bad? Most of the programming tools that can dramatically enhance
your productivity or capabilities require getting your head around new
concepts. Git really isn’t hard, it’s just different. Think of it like
a harp. It’s simple to understand how to play and it looks like
plucking it would be the same as plucking any other stringed
instrument, but it really isn’t and you’re going to have to get used to
it because your fingers interact with it like nothing else, and you’re
not going to be able to take it everywhere either.

On a related note: Ian’s gripes are primarily founded in issues about
usability and the failure of Git’s documentation to help him understand it,
but based on those criteria Darcs
kicks everyone’s ass. I don’t recommend Darcs anymore for a number of
reasons, but, with a couple exceptions, Darcs is damn easy to use
and it’s not a bad system at all.


*Thanks for the correction on Mercurial, Lurker, and thanks SJS for your notes about CVS.

10 Responses  
  • Ian writes:
    April 18th, 2008 at 5:45 pm

    Hi, Kate. Sorry I didn’t see this post much sooner. It’s a well-written reply, and a fair one. This long after the fact, I won’t dig into a point-by-point reply, but in general your responses can be grouped into a few main themes:

    1) Git is harder to use because it’s better — the Blub argument. That would be an argument for Git over, say, Subversion, but not so much over Mercurial. Git and Mercurial are quite similar in power. Git can do some things that would be difficult or impossible in Mercurial, but there is far more overlap than difference.

    2) Usability. It’s true that if you turn on lots of extensions in Mercurial, you can tamper more invasively with history. (I could perhaps be forgiven for having the misapprehension that git-rebase is a command intended for everyday use, since nearly every tutorial seems to mention and encourage its use.) Even so, I find that the exposure of all Git’s commands at the top level gets in the way of finding the command I need. And doing simple things like cloning into a bare repo and then doing a git-fetch should be, well, simpler.

    3) Reliability. That was an error on my part. I lost data (on a backed-up project, of course), but looking back, this seems to have been due to user traps (or pilot error, if you’re less charitable), and not in some underlying data flakiness.

    Again, I’m not seeking to re-ignite the argument. Quite the contrary — I’m going back and attempting to correct my misunderstandings. But your post deserved a response. So: thanks. Looking forward to hearing more about the book.

  • Christoph writes:
    August 15th, 2008 at 4:10 am

    Thanks for this lengthy, patient and technically very interesting reply. I found myself reading Ian’s article yesterday and thought to myself “Yes – that’s exactly what annoys me with Git”. But you made a very important point in your post: Don’t expect Git to work exactly like other RCS that you might be familiar with. I actually don’t care whether Git is versioning the files in my project or more a versioned filesystem with all kinds of funky objects in the backend. But some workings are just different. And while I still don’t like the index/cache I’ll happily use “git commit -a” and perhaps later find a use case for the index. :)

    I think that Ian said Git is seen as a “framework to create your own workflow”. That sounds pretty scary. I need an RCS – not a framework either. If that would rather point out that Git is flexible but you are not forced to do that (similarly to that you are not forced to use “git rebase” unless you see a need) then it would sound much nicer. But if I start going through the tutorial and find myself not understanding even parts of the newbie tutorial then I’m severely screwed. I think “rebase” was one of the features that left me puzzled and made me keep using Mercurial because that was simple enough and didn’t force me to learn new features that I had never missed before. I think I’ll give Git another try.

  • EvanED writes:
    October 9th, 2008 at 5:06 pm

    I know this was posted a long time ago, but it shows up quite high in a Google for Mercurial vs Git (unfortunately none of the top links really have anything approaching a nice, somewhat objective comparison) and I just had to rant about a couple of my complaints on Git from the relatively little I’ve used it.

    When talking about Git’s usability Ian complains that Git installs “nearly 150 distinct binaries” and claims that “Mercurial has one.” What Ian doesn’t mention is that while Git may install nearly 150 files you only have to reference one of them directly just like Mercurial which installs over 150 files (a number which roughly doubles as each .py file begets a binary .pyc file after it gets executed).

    Yes, but while I haven’t used Mercurial, I suspect those go into some Mercurial-specific folder. Git dumps all those git-* commands into bin/.

    For my “work” machine, I don’t have root, so I have a bunch of stuff installed with a prefix of ~/.local. And I actually have a separate bin directory just for Git so that I can do an ls ~/.local/bin and not have 2/3 of the output be Git stuff.

    So if I’m right, Mercurial may indeed install 150 Python files, but it probably does it in a less obnoxious way.

    “When you “revert changes to a small number of files” Git, like any other version control system, gets a copy from the repository, and what do you call the action of getting files from a repository? That’s right kids, “checking out”.”

    No, that’s not true. For instance, take SVN. If you want to revert changes to a small number of files, you use svn revert, which simply reverts the changes. It does NOT contact the repository, at least in the SVN sense of a repository.

    This is both a minor quibble and an important difference. It’s a minor quibble because in some sense it does contact the repository — it’s just that the repository has a single version of the tree, and it’s the version you got when you last did svn update. And the repository it accesses happens to be stored in the .svn directories locally instead of what you (read: an SVN user) typically think of the repository as.

    But at the same time there is an important difference. If you do a git checkout, you get the head of the branch you specify. But if you do svn revert, you get the revision of the file that you last requested. So if I deliberately go back in time with ‘svn update -r 287’ and change a file, I can ‘svn revert’ and I have the file as it was in revision 287. In Git, if I deliberately go back in time with ‘git checkout 4dfa983d’, change a file, and want to revert it, from my reading I have to figure out that my working copy is based on the revision with hash 4dfa983d and then explicitly request that with the next checkout.

    In other words, if I’m working on ‘master’ and do ‘git checkout master’ to revert changes to a file, that’s like doing an ‘svn revert’ then ‘svn update’.

  • Matt W writes:
    March 4th, 2009 at 8:34 am

    EvanED is just plain wrong about ‘git checkout’. In actuality, ‘git checkout <path>’ replaces a file in the working copy with NEITHER the version from the tip of the branch NOR the version checked out when you went back in time. (And actually, when you check out a commit by SHA-1 hash, you’re switching to an unnamed branch, so these two would be equivalent, but neither is what happens.)

    What ‘git checkout’ *actually* does (RTFM) when given a path is to replace the working copy file with the version of that file currently in the _index_. If you haven’t modified the version in the index since you switched branches, then you’ll get the same result as ‘svn revert’. To achieve the same result as ‘svn revert’ in all situations, you would first need to ‘git reset’ the file to replace the version in the index with the version from the current HEAD (which would be the commit to which you went “back in time”) and then ‘git checkout’ to replace the file in your working copy with the version in the index.

    Git makes a lot more sense once you realize that data _never_ moves between the working copy and the repository directly — in either direction. It always goes through the index: ‘git reset’ copies from the repository to the index, ‘git checkout’ copies from the index to the working copy, ‘git add’ copies from the working copy to the index, and ‘git commit’ copies from the index to the repository.

    It might also be worth mentioning that ‘git reset --hard <commit>’ and ‘git checkout -f <commit>’ do almost the same thing. Specifically, they both: update the ‘HEAD’ ref to point at the specified commit, reset the index to the contents of that commit, and check out the index into the working tree. The difference is that ‘reset’ also updates the ref of the branch that you were on, if any.

  • chiguy writes:
    June 7th, 2009 at 8:37 pm

    I don’t see why people say Git doesn’t work well on Windows. I use it on Windows and I’m perfectly happy with it, much happier than when I was using SVN (with tortoise).

  • masukomi writes:
    June 7th, 2009 at 9:09 pm

    Because originally git didn’t work at all on Windows except via Cygwin, which doesn’t really count. Then other tools started to emerge. So, whenever you read someone claiming it doesn’t work, or work well, on Windows, check the date. It was probably written back when this was true.

  • Jack writes:
    September 1st, 2009 at 1:27 pm

    What you want for Windows is msysgit from here http://code.google.com/p/msysgit/downloads/list (choose “Full installer if you want to use official Git…”). It works fine – comes with a bash shell so you can use git exactly as you do in Unix/Linux.

  • Jack writes:
    September 1st, 2009 at 1:27 pm

    Matt W, your explanation of reset, checkout, add, and commit should be the first paragraph on every page of the git manual!

    It seems most people understand add and commit, but few understand reset and checkout. For example, the author of “Pro Git” (http://progit.org/book/ch1-3.html) explains checkout incorrectly (as from repository to working) in Figure 1-6.

    I think the command names are a major factor in the widespread misunderstanding. The word checkout implies getting a fresh copy of the last version checked in. You don’t need experience using CVS to have that impression. Experience using a library will do.

    I don’t think I have ever seen reset explained as you have explained it. It is pretty hard to come up with that understanding from the man page: “git-reset: Reset current HEAD to the specified state. Description: Sets the current head to the specified commit and optionally resets the index and working tree to match.” That is misleading given that the default behavior is to reset the index, with an option to also reset the working tree (--hard), or reset neither (--soft).

    Here is one series of events that is totally confusing if you don’t understand the git index:

    1) modify a file
    2) add it to the index in preparation for a commit
    3) change your mind about committing the change
    4) now try to revert to the original version of the file by checking it out again

    Checking out again will not “revert” the changes, because the changes are already in the index and checkout causes working to match index. You have to reset index from repository, then checkout from index to working. If you dig around, you’ll find there are commands that do both (reset --hard, and commit -f) but to the uninitiated, those just make the git index harder to understand.
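
The four steps above can be reproduced in a scratch repository. This is
only a sketch: the file name and contents are made up, and git reset
followed by git checkout is one way (not the only way) to undo a staged
change.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "original" > story.txt
git add story.txt
git commit -q -m "Add story"

echo "changed" > story.txt          # 1) modify a file
git add story.txt                   # 2) add it to the index
                                    # 3) change your mind (no command needed)
git checkout -- story.txt           # 4) check it out again...
after_checkout=$(cat story.txt)     #    ...which copies from the index, not the commit

git reset -q HEAD -- story.txt      # reset the index entry from the last commit
git checkout -- story.txt           # the index copy is now the committed one
after_reset=$(cat story.txt)

echo "after checkout: $after_checkout / after reset + checkout: $after_reset"
```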

  • Jack writes:
    September 1st, 2009 at 5:18 pm

    In the last sentence above, I meant to say “checkout -f” not “commit -f”.


© Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License.