OK, this is quite a serious vulnerability in Subversion. SVN relies on raw file SHA1 hashes more than git does: git prepends a header before hashing, so a raw SHA1 collision does not translate directly into the kind of easy repository corruption seen with svn.<p>The reason svn is broken is its "rep-sharing" feature, i.e. file content deduplication. It uses a SQLite database to share the representation of files based on their raw SHA1 checksum - for details see <a href="http://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_fs/structure" rel="nofollow">http://svn.apache.org/repos/asf/subversion/trunk/subversion/...</a><p>You can mitigate this vulnerability by setting enable-rep-sharing = false in fsfs.conf - see the documentation in that file or in the source at <a href="http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_fs_fs/fs_fs.c?revision=1737356&view=markup#l862" rel="nofollow">http://svn.apache.org/viewvc/subversion/trunk/subversion/lib...</a><p>This feature was introduced in svn 1.6, released in 2009, and made more aggressive in svn 1.8, released in 2013 <a href="https://subversion.apache.org/docs/release-notes/" rel="nofollow">https://subversion.apache.org/docs/release-notes/</a><p>SVN exposes the SHA1 checksum as part of its external API, but its deduplication could easily have been built on a more secure foundation. Their decision to double down on SHA1 in 2013 was foolish.
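To make the failure mode concrete, here is a rough toy model of checksum-keyed deduplication. It is not Subversion's code: `RepStore`, `weak_key` and `find_collision` are hypothetical names, and the "hash" is deliberately just the first byte of SHA1 so that a collision can be found in a few hundred tries. The point is only that a store which trusts the checksum hands back the first file's content for the second file.

```python
import hashlib


def weak_key(data: bytes) -> str:
    # Stand-in for a broken hash (assumption for the demo): only the first
    # byte of SHA-1 is kept, so collisions are trivial to find.
    return hashlib.sha1(data).hexdigest()[:2]


class RepStore:
    """Toy deduplicating store keyed only by a checksum (not svn's real code)."""

    def __init__(self, key_fn):
        self.key_fn = key_fn
        self.reps = {}

    def put(self, data: bytes) -> str:
        key = self.key_fn(data)
        # "Rep-sharing": if the key is already known, keep the stored
        # representation and silently drop the new content.
        self.reps.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self.reps[key]


def find_collision(key_fn):
    # Brute-force two different inputs that share a key.
    seen = {}
    i = 0
    while True:
        data = b"file-%d" % i
        key = key_fn(data)
        if key in seen:
            return seen[key], data
        seen[key] = data
        i += 1


a, b = find_collision(weak_key)
store = RepStore(weak_key)
key_a = store.put(a)
key_b = store.put(b)
# Different files, same key, and the store returns the first file for both.
print(a != b, key_a == key_b, store.get(key_b) == a)  # True True True
```

With real SHA1 you need the two colliding PDFs instead of a brute-force loop, but the store-side logic that goes wrong has the same shape.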
As mentioned in a previous comment ( <a href="https://news.ycombinator.com/item?id=13722469" rel="nofollow">https://news.ycombinator.com/item?id=13722469</a> ) git doesn't see these files as identical, because it hashes header+content, which breaks the identical-SHA trick.<p>Of course, I first tested this on our main production repository at work because...oh, wait, I didn't because <i>what were you thinking</i>?!
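For anyone who wants to see the header+content hashing, here is a small sketch of how git computes a blob ID: SHA1 over `blob <length>\0` followed by the content. The function names are mine; the results should match what `git hash-object` prints for the same bytes.

```python
import hashlib


def raw_sha1(content: bytes) -> str:
    # Plain SHA-1 of the file bytes, the kind of checksum svn's rep-sharing keys on.
    return hashlib.sha1(content).hexdigest()


def git_blob_sha1(content: bytes) -> str:
    # git hashes "blob <decimal length>\0" + content, not the bare content.
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


print(raw_sha1(b""))             # da39a3ee5e6b4b0d3255bfef95601890afd80709
print(git_blob_sha1(b""))        # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(git_blob_sha1(b"hello\n")) # ce013625030ba8dba906f756967f9e9ca394464a
```

Because the header is fed into the hash before the file bytes, two files crafted to collide as raw SHA1 inputs don't automatically collide as git blobs.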
(from the link) "For the record: the commits have been deleted, but the SVN is still hosed." That is pretty much my memory of working with SVN. I remember SVN fouling its database a few times. Sure I've broken git a few times, but I am always able to (as Jenny Bryan says) "burn the whole thing down" and take state from another copy of the repository.<p>I really tried with SVN (wanted something better than CVS) for quite a long time.
Reminds me of when I worked at an antivirus company. We had to be careful with the EICAR file in test code because it would set off AV alarms. <a href="http://www.eicar.org/86-0-Intended-use.html" rel="nofollow">http://www.eicar.org/86-0-Intended-use.html</a>
A bit hard for me to tell what happened here, maybe because I don't know anything about SVN. The two PDFs with equal SHA1 hashes were committed to the repository with git, but converting that to an SVN commit failed because... SVN can't handle two separate files with the same SHA1 hash?
I have to just say here that WebKit is one of the most over-the-top software projects I've ever tried to dig into, in my twenty years of programming. Building it inside a vanilla container was impossible following their directions exactly and required <i>so much</i> research on my part to get working. I'm used to a bit of back-and-forth with just about every project, but WebKit was ridiculous. After two workdays of trying, I'd been able to build a WebKit from the source, but at that point had to concede to the universe the futility of trying to build a golang-based Phantom, as my friend and former coworker originally wanted. And that also gave me <i>mad</i> respect for Phantom's author and immediately taught me why they do not often incorporate new WebKit versions into the project instead of just pegging to the first one they can get to build.