So to be clear about what this is (because the website doesn’t quite clarify): this collision lets you pick two different prefixes P1, P2, then calculates some pseudorandom data C1, C2 such that SHA1(P1+C1) = SHA1(P2+C2). The length extension property of SHA1 (and MD5) means that now SHA1(P1+C1+X) = SHA1(P2+C2+X) for any X.<p>A similar attack (which requires only a few hours on modest hardware nowadays) has been known for a long time for MD5, but this is the first time it’s been demonstrated for SHA-1.<p>The previous attack, called Shattered (<a href="https://shattered.io" rel="nofollow">https://shattered.io</a>) was a regular collision, that is, they chose a single prefix P and found different C1, C2 such that SHA1(P+C1) = SHA1(P+C2). This can also be length extended, so that SHA1(P+C1+X) = SHA1(P+C2+X). However, this attack is more limited because there is little to no control over the pseudorandom C1 and C2 (the only differing parts of the messages).<p>With a chosen prefix collision, though, things are way worse. Now you can create two documents that are arbitrarily different, pad them to the same length, and tack on some extra blocks to make them collide.<p>Luckily, the first collision should have already warned people to get off of SHA1. It’s no longer safe to use for many applications. (Note, generally for basic integrity operations it might be OK since there’s no preimage attack, but I’d still be a bit wary myself).
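To make the extension argument above concrete, here is a toy sketch. Real SHA-1 chosen-prefix collisions take a large GPU budget, so this swaps in a made-up 16-bit Merkle-Damgård hash (`toy_hash`, built from truncated SHA-256) where a simple birthday search finishes instantly. The block size, the function names, and the two prefixes are all assumptions for illustration; this is not the researchers' attack, only the same structure in miniature.

```python
import hashlib
from itertools import count

BLOCK = 8   # toy block size in bytes
STATE = 2   # toy chaining state in bytes (16 bits, so collisions are cheap)
IV = b"\x00" * STATE

def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function: truncated SHA-256 of state || block."""
    return hashlib.sha256(state + block).digest()[:STATE]

def absorb(state: bytes, data: bytes) -> bytes:
    """Feed block-aligned data through the compression function."""
    assert len(data) % BLOCK == 0
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state

def toy_hash(msg: bytes) -> bytes:
    """Merkle-Damgard construction with length padding, like SHA-1 in miniature."""
    padded = msg + b"\x80"
    padded += b"\x00" * (-len(padded) % BLOCK)
    padded += len(msg).to_bytes(BLOCK, "big")
    return absorb(IV, padded)

def chosen_prefix_collision(p1: bytes, p2: bytes):
    """Find one-block C1, C2 so the internal states after P1+C1 and P2+C2 match."""
    s1, s2 = absorb(IV, p1), absorb(IV, p2)
    table = {}
    for n in range(1024):                      # candidate blocks for the first prefix
        c1 = n.to_bytes(BLOCK, "big")
        table.setdefault(compress(s1, c1), c1)
    for n in count():                          # search blocks for the second prefix
        c2 = n.to_bytes(BLOCK, "big")
        hit = table.get(compress(s2, c2))
        if hit is not None:
            return hit, c2

# Two arbitrarily different prefixes, padded to the same block-aligned length.
P1 = b"pay Alice $10   "
P2 = b"pay Mallory $999"
C1, C2 = chosen_prefix_collision(P1, P2)
M1, M2 = P1 + C1, P2 + C2

print(toy_hash(M1) == toy_hash(M2))            # the messages collide
for X in (b"", b"!", b"any common suffix at all"):
    print(toy_hash(M1 + X) == toy_hash(M2 + X))  # and stay colliding after appending X
```

The collision survives any suffix because the two internal states are equal after `C1`/`C2` and the messages have equal length, so every later block (including the length padding) is identical on both sides.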
> We note that classical collisions and chosen-prefix collisions do not threaten all usages of SHA-1. In particular, HMAC-SHA-1 seems relatively safe, and preimage resistance (aka ability to invert the hash function) of SHA-1 remains unbroken as of today.<p>Nice to see this bit of intellectual honesty. Would be even nicer if they had explained what that means in terms of PGP keys.
This kind of thing always brings me down a bit. It's not rational, but it does.<p>I mean I truly admire these folks' skills, the math involved is obviously remarkable.<p>But I think the feeling is related to not being able to rely on anything in our field. Hard to justify going to the trouble of encrypting your backup. 10 years from now, it might be as good as plain text.<p>It's not only security, nothing seems to work in the long term.
Imagine an engineer receiving a call at midnight about his bridge because gravity changed during daylight saving in a leap year. That's our field.
General questions:<p>(edit: these are indeed general questions, not just about SHA1)<p>Has anyone else been worried about data deduplication done by storage and/or backup systems, considering that they usually use hashes to detect data blocks that are "the same" (without additional metadata) and avoid storing those "duplicate data blocks" again? Doesn't this seem far worse when you also consider that systems like Dropbox deduplicate data across all their users (expanding the footprint for collisions)? Are there any research papers/articles/investigations about this?
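For what it's worth, here is a minimal sketch of the kind of hash-keyed deduplication the question describes, with one possible mitigation: comparing bytes when a key already exists. `DedupStore` is a hypothetical class made up for this example (it is not how Dropbox or any particular product works), and SHA-256 is my choice of hash, not something stated in the thread.

```python
import hashlib

class DedupStore:
    """Hypothetical content-addressed block store (illustration only)."""

    def __init__(self):
        self.blocks = {}  # hex digest -> stored bytes

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        existing = self.blocks.get(key)
        if existing is None:
            self.blocks[key] = data            # first time we see this content
        elif existing != data:
            # Same digest, different bytes: a collision. A store that skips
            # this check would silently hand back the wrong block later.
            raise ValueError(f"hash collision detected for key {key}")
        return key

store = DedupStore()
k1 = store.put(b"block A")
k2 = store.put(b"block A")   # deduplicated: same key, nothing new stored
k3 = store.put(b"block B")
print(k1 == k2, len(store.blocks))
```

The byte comparison costs a read of the existing block on every duplicate write, which is presumably why a system might be tempted to skip it and trust the digest alone.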
Just a curiosity, since people are talking about Git still using SHA-1 (despite work on SHA-256 since 2017).<p>I see that Git doesn't actually use SHA-1 any more, it uses "hardened SHA-1": <a href="https://stackoverflow.com/questions/10434326/hash-collision-in-git/43355918#43355918" rel="nofollow">https://stackoverflow.com/questions/10434326/hash-collision-...</a>
> SHA-1 has been broken for 15 years, so there is no good reason to use this hash function in modern security software.<p>Why are cryptographers always exaggerating things and so out of touch with reality? The first actual collision was like 3 years ago. It's not like the world has been on fire in the meantime, and it's not like SHA-1 is broken for every single possible usage even now. And why the nonsense with "no good reason"? Obviously performance is one significant consideration for the unbroken use cases. Do they think painting a different reality than the one we live in somehow makes their case more compelling?
><i>A countermeasure has been implemented in commit edc36f5, included in GnuPG version 2.2.18 (released on the 25th of November 2019): SHA-1-based identity signatures created after 2019-01-19 are now considered invalid.</i><p>Since SHA-1 was always possible to break, and since NSA probably gets access to big computers and sophisticated techniques before researchers, why doesn't this invalidate every SHA-1 signature ever made and not just ones from last year?
Quick question about the "What should I do" section. It says "<i>use instead SHA-256</i>". Isn't SHA-512 both better and faster on modern hardware?
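One way to check the "faster" part on your own machine is a rough throughput measurement with Python's `hashlib`. This is only a sketch: the buffer size and round count are arbitrary assumptions, and the answer depends on the CPU. SHA-512 works on 64-bit words and 128-byte blocks, which tends to favor it on 64-bit CPUs without dedicated SHA instructions, while CPUs with SHA extensions accelerate SHA-1 and SHA-256 instead.

```python
import hashlib
import time

def throughput_mb_s(name: str, size_mb: int = 16, rounds: int = 3) -> float:
    """Best-of-N throughput in MB/s for one hashlib algorithm."""
    data = b"\x00" * (size_mb * 1024 * 1024)
    best = float("inf")
    for _ in range(rounds):
        start = time.perf_counter()
        hashlib.new(name, data).digest()
        best = min(best, time.perf_counter() - start)
    return size_mb / best

results = {name: throughput_mb_s(name) for name in ("sha1", "sha256", "sha512")}
for name, speed in results.items():
    print(f"{name:>7}: {speed:8.1f} MB/s")
```

Speed aside, both SHA-256 and SHA-512 are currently considered unbroken, so the numbers only matter if hashing is actually a bottleneck.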
Out of curiosity, can anyone explain in layman's terms the differences in design that make SHA-1's successors immune to the known attacks against SHA-1? Ultimately was this the result of an apparent flaw in SHA-1 that only became obvious in retrospect, or was it something totally unforeseeable?
> security level 2 (defined as 112-bit security) in the latest release (Debian Buster); this already prevents dangerous usage of SHA-1<p>FWIW this doesn't apply to Fedora currently, because it has a patch that re-enables SHA-1 in security level 2 in non-FIPS mode: <a href="https://src.fedoraproject.org/rpms/openssl/blob/master/f/openssl-1.1.1-seclevel.patch" rel="nofollow">https://src.fedoraproject.org/rpms/openssl/blob/master/f/ope...</a>
So how would someone go about gaining more than 45k USD in profit from a single case of using the chosen-prefix collision?
I'm not being facetious; I am honestly curious. I'd guess that even in situations where you somehow get a signed e-mail sent off spoofing a CEO saying "Please pay these guys $50k", the actual payout seems unlikely, and that puts the attacker 45k in the red. But maybe there are some obvious avenues of abuse that I'm missing, or is this more a case of "In a decade it will become economical to abuse this for profit"?
The full paper is <a href="https://eprint.iacr.org/2020/014.pdf" rel="nofollow">https://eprint.iacr.org/2020/014.pdf</a> if anyone is interested
Let's say that you know that someone stores documents by SHA, and silently overwrites collisions. Is there any way this would help to deceive them after being forced to give them your data? It seems like once the data is out of your control, you can't match an existing SHA, and if you created a pair of documents that match SHAs, you can't predict which one will be overwritten.
Link to the GnuPG commit:<p><a href="https://dev.gnupg.org/rGedc36f59fcfc" rel="nofollow">https://dev.gnupg.org/rGedc36f59fcfc</a>
The root certificate authority for my company's Active Directory is signed using a sha1 hash. What are the practical implications of this chosen collision?<p>How do I convince my IT department to update our CA to sha256?
> Responsible Disclosure<p>We have tried to contact the authors of affected software before announcing this attack, but due to limited resources, we could not notify everyone.<p>Is there a list of affected software out there?
Does this affect Git? I believe it uses SHA-1 for commits. Is it possible to use this attack to add malicious code to a git repository without changing the hashes for the commits?
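For context on where SHA-1 enters the picture, this is how Git derives the ID of a file (a "blob") in a classic SHA-1 repository: it hashes a short header, `blob <size>\0`, followed by the raw content. The helper name `git_blob_id` is made up for this sketch, and it uses plain SHA-1 from `hashlib`, so it does not include the collision-detecting "hardened SHA-1" variant mentioned elsewhere in the thread.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Object ID Git assigns to a blob in a SHA-1 repository."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()

# The well-known ID of the empty blob is e69de29bb2d1d6434b8b29ae775ad8c2e48c5391.
print(git_blob_id(b""))
print(git_blob_id(b"hello world\n"))
```

You can compare the output against `git hash-object <file>`. Commits and trees are hashed the same way with a different header, and each commit's content includes the IDs of its tree and parents, which is why a colliding object would be needed to swap content without changing the commit hashes.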
While a meaningful accomplishment, suggesting the algorithm is in a "shambles" seems hyperbolic to me. For one thing there's a non-trivial practical leap between formulating two colliding identities and forging an existing one, and for another this was only modestly better than a pure brute force attack. If anything I'm somewhat reassured by the idea that it still costs $40,000+ of GPU time to pull something like this off while doing the same with MD5 is feasible on a mobile phone.
I assume there was a lot of work (read money) put in those collision attacks rather than it being discovered by accident. I'm wondering who is sponsoring this work and for what purpose?
The usual argument, that demonstrating a break pushes people toward better cryptography, doesn't seem sufficient in this case, since SHA-1's weaknesses had already been shown before. Was the purpose here just to make the attack cheaper?
>Can I try it out for myself?
Since our attack on SHA-1 has practical implications, in order to make sure proper countermeasures have been pushed we will wait for some time before releasing source code that allows to generate SHA-1 chosen-prefix collisions.<p>Sigh. Again with this idiocy. Any adversary who can afford to launch this attack can also afford to write the exploit themselves.
> By renting a GPU cluster online, the entire chosen-prefix collision attack on SHA-1 costed us about 75k USD.<p>So they just decided to try their attack and spend two years' worth of salary on it?? That's crazy.