There’s a good deal of confusion about Git, and Ian’s “Use
Mercurial You Git” article is a good example of it, which I’d
like to address point by point. But first, I’d like to say that I’m
giving Ian the benefit of the doubt. I don’t think he’s intentionally
trying to mislead people; I think he simply doesn’t understand Git very
well, and that’s not his fault. Git has two problems that face new
users. It suffers from the Blub Paradox
and the documentation tends to assume that you already “get” how
radically different Git is. So people end up applying their assumptions
about how other version control systems work to Git and end up
confused and frustrated like Ian. While some of the commands
look similar, what Git is doing and how it does it are dramatically
different from other systems, but that’s a good thing. It’s what makes
Git so much better than the competition. It’s kind of like how Struts
(Java web framework) and Ruby On Rails (Ruby web framework) are both
doing essentially the same thing and processing the same parameters
from a web server, but if you try and program a Rails app like you
program a Struts app you’re in for a world of hurt.
The basic version control concepts are the same in Git, but at the same
time, they’re not. Most version control systems are concerned with
managing files, or metadata about files. Git doesn’t even care about
files. Yes, it seems like it does, and you can even use it as if it
does, but it really doesn’t. What really sold me on this statement was
realizing that, if I really wanted to, Git would let me pick and choose
between the various hunks of a file and add only the ones I wanted to
the next commit.
My advice to new users is simple. To get started, act as if it’s like
the version control system you’re familiar with, but as soon as
something doesn’t work the way you’d expect, take the time to learn what
is really happening and why Git is doing it. Every single time this has
happened to me I have had another “aha!” moment which gave me a
significantly better understanding of Git as a whole and showed me how
truly awesome Git really is.
So, with that said, let’s look at Ian’s complaints:
Installed Files
When talking about Git’s usability, Ian complains that Git installs
“nearly 150 distinct binaries” and claims that “Mercurial has one.”
What Ian doesn’t mention is that while Git may install nearly 150 files,
you only have to reference one of them directly, just like Mercurial,
which installs over 150 files (a number which roughly doubles as each
.py file begets a binary .pyc file after it gets executed). But this is
a straw-man argument, because it doesn’t matter how many files an
application installs, and it doesn’t matter if they’re binary or not. A
much more important measure is how many you have to interact with
directly, and in both
cases the answer is one. The confusion is probably a result of the fact
that Git actually allows you to call its component parts directly if
you really want to, whereas Hg does not. Another
good example of this is Dreamweaver, which installs a metric fuckload
of files, including a little JavaScript file for almost every one of
the cool pieces of functionality. But you never hear newbish web
developers complaining about it because they never have to interact
with them directly. However, power users love the fact that they can work with them
directly (and change how Dreamweaver works).
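As a small illustration of calling Git’s component parts directly, here’s a sketch that stores a piece of content and inspects it using two of the low-level commands. The scratch repo and the variable names are just assumptions for the example.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# "git hash-object -w" stores content as a blob and prints its object ID.
blob_id=$(echo "hello" | git hash-object -w --stdin)

# "git cat-file" inspects any object by ID: -t gives its type, -p its content.
blob_type=$(git cat-file -t "$blob_id")
blob_content=$(git cat-file -p "$blob_id")
echo "$blob_id is a $blob_type containing: $blob_content"
```

You never need any of this for day-to-day work, which is the point: it’s there if you want it.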
Usability
Ian uses Git’s rebase command as an example of why Git’s usability is
bad. Rebase is one of the most dramatically powerful tools in the Git
arsenal, but it’s also something you never have to use if you don’t want
to. It allows you to reorder, merge, and exclude any or all of the
commits in your history. But guess what? Ian doesn’t mention the fact
that Mercurial has similar functionality called Transplant, which is
roughly as easy to get confused about using as Git’s rebase
functionality. Of course you don’t ever have to use it in Mercurial
either, as evidenced by his not knowing about it. I’m assuming he
wasn’t intentionally trying to make Git look bad by ignoring
Transplant. The fact that the Rubinius project he was looking at
requests that committers use rebase for its valuable housekeeping aspects
is irrelevant to the discussion of Git’s usability.
He then goes on to say that “For day-to-day use of Mercurial, you only
need hg fetch to get code, and hg commit to give code.” Except commit
only works on a local repo and doesn’t “give” code to anyone, and my
fairly recent copy of Hg only has a “pull” command, though I do see
mention in Google that fetch is “…only available when you enable
hgext.fetch extension…”. So what he should have said (unless we want to
include non-default commands) is that “for day-to-day use of Mercurial,
you only need hg pull && hg up && hg merge && hg commit to get code and
hg push to give code.” If we cut him some slack and replace all that
garbage with hg fetch, and compare it to Git, where you need git pull to
get code and git push to give code, you’ll see that the two are almost
identical. In fact, for a surprising number of operations Git and Hg use
exactly the same name for the same command.
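To show the Git side of that “get code / give code” cycle, here’s a minimal sketch with a shared repo and two clones. The names hub.git, alice, and bob are invented for the example, and everything lives in a temp directory.

```shell
work=$(mktemp -d)
cd "$work"

# A seed repo with one commit, published as a bare shared repo.
git init -q seed
git -C seed -c user.name="Example User" -c user.email="user@example.com" \
    commit -q --allow-empty -m "Initial commit"
git clone -q --bare seed hub.git

# Two developers (hypothetical "alice" and "bob") clone the shared repo.
git clone -q hub.git alice
git clone -q hub.git bob

# Alice commits locally, then "gives" code with git push.
cd alice
git config user.name "Alice"
git config user.email "alice@example.com"
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"
git push -q

# Bob "gets" code with git pull.
cd ../bob
git pull -q
```

Note that the commit stays local until the push, which is the same split between commit and push that Mercurial has.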
Overloaded Commands
Ian complains about the fact that git checkout can do three
“massively different tasks”: change to a new head, revert changes to a
small number of files, and create a branch. From the perspective
of someone who doesn’t understand Git or version control systems, I can
see why they might look “massively different”. But let’s look at them
from the perspective of someone who understands Git and version
control a little better.
First off, if you don’t know, “change to a new HEAD” is an odd way of
saying switch to a different branch (HEAD is just a convenient pointer
to the current branch). When you grab a branch from most version
control systems’ repositories, the operation is referred to as
“checking out” the code or branch. So when you replace the tree in your
current directory with that of another branch, you are literally
“checking out” the code from the repository, which is what every system
does when you request the files from a branch. But if “checking out” a
branch was confusing for him, he could simply keep each branch in a
separate folder, like most revision control systems do, and just change
directories.
To cut him some more slack, Git allows you
to manage multiple branches within the same repo, which entails managing
multiple branches in the same folder.
And yes, the power to manage totally different branches in the same
place does require some new learning, and has some potential gotchas
if you don’t understand how it works, but note that I said “allows you
to manage”, not “requires”.
You can live in the simpler one-branch-per-root-folder world if you
want, and there’s no real penalty to doing so. Actually, I encourage new
users to work that way. It’s a simpler way to get started. And yes, to
be clear, each folder would be a different repo, but Git’s smart and
hardlinks the common objects when you clone locally (unless you tell it
not to) to save space, and merging branches in across repos is just a
matter of git pull instead of git merge.
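The one-branch-per-folder approach described above might look something like this sketch. The folder names stable and experiment and the committer identity are assumptions made up for the example.

```shell
work=$(mktemp -d)
cd "$work"

# The "main line" lives in its own folder.
git init -q stable
cd stable
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"
echo "hello" > README
git add README
git commit -q -m "Initial commit"
cd ..

# "Branch" by cloning into a sibling folder. For a local clone like this,
# Git hardlinks the object files where it can rather than copying them.
git clone -q stable experiment
cd experiment
git config user.name "Example User"
git config user.email "user@example.com"
echo "experimental change" >> README
git commit -q -a -m "Try an experiment"

# "Merge" by pulling from the other folder instead of running git merge.
cd ../stable
git pull -q ../experiment
```

Nothing new was committed in stable in the meantime, so the pull here simply brings the experiment’s commit across.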
When you “revert changes to a small number of files” Git, like any
other version control system, gets a copy from the repository, and what
do you call the action of getting files from a repository? That’s right
kids, “checking out”. The fact that Git calls it what it is instead of
giving it some misleading name like CVS’s “update” command, which
doesn’t “update” shit but instead checks out a version from the repo,
should be a point in Git’s favor, not a point against it. Hg calls this operation “update” too, but “checkout” is an alias for it*, so I guess they get to make it work the way both the CVS / SVN people and the Git people expect.
His argument about checkout enabling you to create a branch is a fair
one, but isn’t it nicer to be able to create the branch and check it
out in one command instead of two? If Ian was really confused by this
he could simply create it with “git branch <branch-name>”
and then check it out with “git checkout <branch-name>”.
Personally I prefer the combined form since I’m creating them all the time.
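The uses of checkout discussed above can be sketched roughly like this. The branch names topic-a and topic-b, the file name, and the identity are all made up for illustration, and the repo is a scratch one in a temp directory.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"
echo "original" > file.txt
git add file.txt
git commit -q -m "Initial commit"

# Create a branch, then check it out: two commands.
git branch topic-a
git checkout -q topic-a

# Create a branch and check it out in one command.
git checkout -q -b topic-b

# Throw away an uncommitted edit by checking the file out again.
echo "oops" > file.txt
git checkout -q -- file.txt
restored=$(cat file.txt)

current=$(git rev-parse --abbrev-ref HEAD)
echo "on branch $current, file.txt contains: $restored"
```

In every case Git is doing the same thing underneath: getting content out of the repository and putting it in your working tree.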
In writing this book on Git I have repeatedly come across situations
where a command took options that made it seem to do “massively
different tasks” but when I spent the time to investigate what it was really doing I
almost always found that the problem was simply that I had been
applying
behavioral expectations from other version control systems to Git
without understanding the fundamentals of how things work in Git. Now,
helping people to understand its fundamentals is something that
Git’s docs are not particularly good at. They tend to tell you how to
execute a command and assume you understand what’s happening (at a
conceptual level) behind the scenes. This is one of the things I’m
trying to address with the book. “git add” is a great example of this.
If you don’t really understand how Git leverages its index you’ll be
bewildered by all those “massively different tasks” you can do with
it, but once you understand the index they all make sense and you
realize that “add” is exactly where they should be.
Simplicity
He says that Git has lost the “simplicity argument”. But the
basics of Git are, in fact, very simple to use. If you wanted to
you could use it in almost exactly the same way you use CVS with
commands of essentially similar complexity, and CVS is a
very simple system. But, Git also offers you the power to do
very complex things. And that’s exactly how I wish every piece of
software I used was.
Let me give you an example of how simple things can be:
| Operation | CVS | Git | Hg |
| --- | --- | --- | --- |
| checkout a repo | cvs co | git clone | hg clone |
| add a file | cvs add | git add | hg add |
| replace a file with the version last checked in | cvs update | git checkout | hg revert |
| commit changes to the repo | cvs commit | git commit | hg commit |
I could continue but you get the point. For all your basic commands
these systems are all equally simple. Yes there are complex commands
too, but there are in all version control systems.
Ian quotes a bit of Git knowledge:
The core Git filesystem can be explained as four
types of objects: Blobs (files), Trees (directories), Commits and Tags.
And then says “Unfortunately, no, it can’t. The core of Git
may well be implemented
as four kinds of things. But to get even the most basic tasks
done, you need to know repositories, working trees, branches, remotes,
masters, origins, index caches, and a bunch of other unexplained
concepts.”
With the exception of the index (or “index cache” as he calls it) all
of these things exist in Mercurial and you also need to understand the
same ones in both. Although maybe Mercurial does have something similar
to the index for staging commits, I don’t know. You could argue that
“masters” don’t exist in Hg but master is just the default name for the
initial branch in a repo. They had to call it something and they picked
“master”. Every version control system has at least one branch whether
or not they happen to mention it, and if you only use one branch per
repo you never have to even think about “master”. And if you don’t want
to name it “master” you don’t have to. SVN has the idea of remote
repositories just like Hg and Git and you don’t need to use
remote repositories in any
of them. It’s certainly helpful if you understand the concept but it’s
not as if this is some obscure thing that Git came up with and no-one
else has. To do basic CVS style operations you can be
blissfully ignorant of the index in Git. All you need to know is to use
the “-a” option when you commit to commit all your changes.
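A minimal sketch of that CVS-style workflow, assuming a scratch repo and a made-up file name and identity:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"

echo "first draft" > report.txt
git add report.txt                    # a brand-new file still has to be added once
git commit -q -m "Add report"

echo "second draft" > report.txt
git commit -q -a -m "Revise report"   # -a commits every modified tracked file

pending=$(git status --porcelain)     # empty when nothing is left uncommitted
```

One caveat worth knowing: -a only picks up files Git already tracks, so new files still need a git add, just as they need a cvs add in CVS.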
Finding Documentation
He’s totally right about not being able to look up the appropriate
command in Git’s man pages without knowing what it’s called. It is
sometimes hard to find instructions for what you’re trying to do in
Git. This is another thing I’m trying to address in the book, but that’s
no help to people who don’t have the book, and no excuse for it not
being better handled by the docs. Since the book will be GPL, though, I
fully expect some of what I write to make it onto the web (I may even
put it there myself), and maybe into the docs.
Windows
He’s also right about Git sucking on Windows, although he doesn’t put
it that way, and this is the biggest reason not to use Git. If
you need to share your codebase with Windows users Git is, IMNSHO,
simply not ready. It can be done with Cygwin but Mercurial is much less of a hassle
on Windows. There is work being done to address this, but for now Git
is primarily for Linux, Unix, and OS X.
Unreliability
He also complains about Git being unreliable but doesn’t back it up in
this post. I’ve seen no evidence to this effect or mention of any on
the web, but I’m betting that his problems with Git’s “reliability”
are rooted in similar misunderstandings of how Git works, which
would mean that Git needs to work on its documentation to help
improve people’s understanding.
Closing Thoughts
Ian makes a lot of complaints about how much worse Git is than Hg, but
in almost all cases if you examine the truth behind what he says you
end up with Git and Hg coming out almost exactly the same. The places
where Mercurial wins are Windows support and, probably, documentation.
Both of these are very important, but neither of them makes the tool
more functional. For everyday simple usage Hg and Git are roughly
equivalent tools from a usage standpoint (assuming you understand how
to use them), but I believe that Git is simply a more powerful tool
that lets you go much farther beyond the “everyday simple usage”. Yes,
it requires thinking differently about version control. Is that really
so bad? Most of the programming tools that can dramatically enhance
your productivity or capabilities require getting your head around new
concepts. Git really isn’t hard, it’s just different. Think of it like
a harp. It’s simple to understand how to play and it looks like
plucking it would be the same as plucking any other stringed
instrument, but it really isn’t and you’re going to have to get used to
it because your fingers interact with it like nothing else, and you’re
not going to be able to take it everywhere either.
On a related note: Ian’s gripes are primarily founded in issues about
usability and the failure of Git’s documentation to help him understand
it, but based on those criteria Darcs
kicks everyone’s ass. I don’t recommend Darcs anymore for a number of
reasons, but, with a couple of exceptions, Darcs is damn easy to use
and it’s not a bad system at all.
*Thanks for the correction on Mercurial, Lurker, and thanks, SJS, for your notes about CVS.