This could actually give Mercurial a big edge over Git in development environments where large binary files are a core part of the workflow, like game development. Products like Perforce are a big hit in games precisely because they are really good at handling this specific class of file.

It's a shame, because I hate using Mercurial, but this would give me a very strong reason to use it for my game projects instead of Git.
I'm the developer of [git-annex](http://git-annex.branchable.com/), which is AFAIK the closest equivalent for git. I only learned about the mercurial bfiles extension (which became the largefiles extension) after designing git-annex.

The designs are obviously similar at a high level, but one important difference is that git-annex tracks, in a fully distributed manner, which git repositories currently contain the content of a particular large file. The mercurial extension is, AFAIK, rather more centralized; while it can transfer large file content from multiple stores, it can't, for example, transfer a large file from a nearby client that happens to currently have a copy, which git-annex can do (if a remote is set up). This location tracking also allows me to have offline archival disks whose content is tracked with git-annex. If I ask for an archived file, git-annex knows which disks I can put online to retrieve it.

Another difference is that the mercurial extension always makes available *all* the large files for the currently checked-out tree. git-annex allows a tree to be checked out with large files not present (they appear as broken symlinks); you can ask it to populate the tree, and it retrieves the files as a separate step. This is both more complex and more flexible. For example, I have a git repository containing a few terabytes of data. It's checked out on my laptop's 30 gb SSD. Only the files I'm currently using are present on my laptop, but I can still *manage* all the other files, reorganizing them, requesting ones I need, etc. (There's a sketch of that workflow after the transcript below.)

git-annex also has support for special remotes, which are not git repositories, but in which large files are stored. So large files can be stored in Amazon S3 (or the Internet Archive's S3), in a bup repository, or downloaded from arbitrary urls on the web.

Content in special remotes is tracked the same as content in other remotes. This lets me do things like the following (the first file is one of my Grandfather's engineering drawings of the Panama Canal locks):

    joey@gnu:~/lib/big/raw/eckberg_panama>git annex whereis img-0124.png
    whereis img-0124.png (5 copies)
      5863d8c0-d9a9-11df-adb2-af51e6559a49 -- turtle (turtle internal drive)
      7e55d8d0-81ab-11e0-acc9-bfb671110037 -- archive-panama (internet archive http://www.archive.org/details/panama-canal-lock-design-papers)
      905a3a64-4149-11e0-8b3f-97b9501cdcd3 -- passport (passport usb drive 1 terabyte)
      9b22e786-dff4-11df-8b4c-731a6178061c -- archive-leech (archive-6 sata drive)
      f4c185e2-da3e-11df-a198-e70f2c123f40 -- archive (archive-5 sata drive)
    ok
    joey@gnu:~/lib/big/raw/eckberg_panama>git annex get img-0124.png --from archive-panama
    get img-0124.png (from archive-panama...) ok
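To make the partial-checkout point above concrete, here's a rough sketch of the day-to-day commands (the filename is made up; this shows the shape of the workflow, not a real transcript):

    git annex add bigfile.mov       # check the file into the annex; a symlink takes its place
    git commit -m "add bigfile.mov"
    git annex drop bigfile.mov      # free the local copy (refused unless another repo has it)
    git annex get bigfile.mov       # later, fetch the content back from any remote that has it
    git annex whereis bigfile.mov   # list which repositories currently have the content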
I'm hopeful that git will grow some internal hooks for managing large files that will improve git-annex and also allow others to develop extensions that, perhaps, behave more like the mercurial largefiles extension. I recently attended the GitTogether and this stuff was a major topic of discussion.
Another option, available since Mercurial 1.5, is to put the large files in a Subversion repository and reference it as a subrepository.

http://mercurial.selenic.com/wiki/Subrepository#SVN_subrepositories
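A minimal sketch of that setup, with a made-up SVN URL: the `.hgsub` file maps a local path to the external repository, and the `[svn]` prefix marks the subrepo kind.

    svn checkout https://svn.example.org/gameassets/trunk assets
    echo "assets = [svn]https://svn.example.org/gameassets/trunk" > .hgsub
    hg add .hgsub
    hg commit -m "reference large assets as an SVN subrepository"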
/me wonders when Mercurial will ever do anything other than copy BitKeeper.
We've been doing this for years; my photos are in a ~100GB BK/BAM repo.

Release notes for BitKeeper version 4.1 (released 12-Oct-2007)

Major features

BAM support. BAM stands for "Binary Asset Management" and it adds support to BK for versioning large binaries. It solves two problems:

a) one or more binary files that are frequently changed.

b) collections of many large binaries where you only need a subset.

The way it solves this is to introduce the concept of BAM server[s]. A BAM server manages a collection of binaries for one or more BAM clients. BAM clients may have no data present; when it is needed, the data is fetched from the BAM server.

In the first case above, only the tip will be fetched. Imagine that you have 100 deltas, each 10MB in size. The history is 1GB, but you only need 10MB in your clone.

In the second case, imagine that you have thousands of game assets distributed across multiple directories. You typically work only in one directory at a time. You will only need to fetch the subset of files that you need; the rest of the repository will have the history of what changed but no data (so bk log will work but bk cat will have to go fetch the data).
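In practice that looks roughly like this (paths and server name made up; `bk log` and `bk cat` behave as described above):

    bk clone bk://server/game-assets work     # clone brings history, not the BAM binaries
    cd work
    bk log textures/rock.tga                  # works offline: the history is local
    bk cat textures/rock.tga > /tmp/rock.tga  # fetches the binary from the BAM server on demand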