TechEcho
Hash collisions and exploitations – Instant MD5 collision

167 points by losfair over 2 years ago

13 comments

Dwedit over 2 years ago
For anyone out there still using MD5 for any reason, check out this PDF file: https://www.alchemistowl.org/pocorgtfo/pocorgtfo14.pdf (42MB). You can also rename it to a .NES file and run it in a NES emulator.

It's a PDF file which is also a NES ROM that displays its own MD5 sum. The PDF also shows its own MD5 sum a few times. (The MD5 sum also happens to begin with 5EAF00D)

When an arbitrary MD5 can be created that easily, it's useless for any cryptographic applications, or even any data integrity.
jrootabega over 2 years ago
Disclaimer: This is a fun thought experiment. I'm not looking for actionable results, or advocating for relying on any of this comment for actual security. I'm clearly not a cryptographer; I just think it would be interesting to talk about here, and maybe more educated people could comment on how well these approaches might mitigate the exploits in the article. Play with me in this space.

I'm curious if people have any interesting ideas on how to add some seasoning to MD5 to make it more secure. That is, simple, intuitive things you can do in combination with MD5 such that all the pieces in your scheme are still easily understood and don't amount to a new hash algorithm that can only be understood as a black box. Pretend MD5 is the only hash algorithm that has ever been found. Or that you're the Gilligan's Island Professor and MD5 hashes are your coconuts. What are the most potentially useful things you can build out of the most primitive, dumb components?

For example:

- Output the length of the input (or a hash of the length if you must have a constant-length output)

- Hash the input forwards and backwards and produce two hashes. (Remembering that, though the output is 256 bits now, you still only have coconuts to work with.)

- Include more complicated variations on the input in the hashes. e.g. start in the middle and oscillate forward and backward over the input, or move the second half of the input in front of the first before hashing, or use the input/hash of the input to seed a pseudorandom re-ordering of the input before hashing, etc.

- Format-aware hashing - whatever program will interpret the content of the file can also produce a hash, or some [canonical] interpretation of the content that can be hashed. e.g., for an image format, we could ask the renderer how many iterations of some operation it had to perform to render the output, or in the worst case, hash the bitmap it produced.
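The first two ideas in the list above can be sketched in a few lines of Python. This is only an illustration of the thought experiment, not a vetted construction: the function name `seasoned_md5` and the output layout (a fixed-width length field, then the two digests) are assumptions made up for this example, and nothing here claims the combination actually resists the collision techniques from the article. In particular, colliding files produced by the usual tools tend to have the same length, so the length field probably adds little on its own.

```python
import hashlib


def seasoned_md5(data: bytes) -> str:
    """Hypothetical 'seasoned' digest: input length plus MD5 forwards and backwards."""
    # MD5 of the input as given.
    forward = hashlib.md5(data).hexdigest()
    # MD5 of the same bytes in reverse order.
    backward = hashlib.md5(data[::-1]).hexdigest()
    # A fixed-width hex length field keeps the overall output a constant size.
    return f"{len(data):016x}:{forward}:{backward}"


if __name__ == "__main__":
    print(seasoned_md5(b"coconut"))
```

Two inputs would have to match in length, in forward MD5, and in reversed MD5 to produce the same string; whether that is meaningfully harder than a plain MD5 collision is exactly the open question of the comment.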
omoikane over 2 years ago
See also: "Lifetimes of cryptographic hash functions" - https://valerieaurora.org/hash.html

MD5 appears to be firmly in the "fun party trick" stage.
londons_explore over 2 years ago
Question for people into cryptography + data archiving...

If I want to store data for 500 years, I want future people to be reasonably sure of the integrity of the data, both against 'bit rot', but also deliberate tampering.

Is the best available approach to hash the data with a bunch of hash algorithms and publish all the hashes?

Then if *any* hash algorithm remains unbroken, the integrity of my data is certainly still good. An attacker would have to do a simultaneous preimage attack for *every* hash algorithm I choose to break the scheme, which historically has never happened to my knowledge.
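A minimal sketch of the "publish several hashes" idea, using Python's standard `hashlib`. The algorithm list and the names `ALGORITHMS`, `multi_digest`, and `verify` are assumptions chosen for illustration; the comment does not say which algorithms to combine. Verification here accepts the data only if every published digest matches, so a forger would need to defeat all of the listed algorithms at once.

```python
import hashlib

# Illustrative choice of algorithms; any set supported by hashlib would work.
ALGORITHMS = ("md5", "sha256", "sha3_256", "blake2b")


def multi_digest(data: bytes) -> dict[str, str]:
    """Compute one hex digest per algorithm, suitable for publishing as a manifest."""
    return {name: hashlib.new(name, data).hexdigest() for name in ALGORITHMS}


def verify(data: bytes, published: dict[str, str]) -> bool:
    """Accept only if the manifest is non-empty and every published digest matches."""
    if not published:
        return False
    current = {name: hashlib.new(name, data).hexdigest() for name in published}
    return all(current[name] == digest for name, digest in published.items())


if __name__ == "__main__":
    manifest = multi_digest(b"archive me")
    print(manifest)
    print(verify(b"archive me", manifest))
```

Requiring all digests to match is the design choice that gives the "any one unbroken algorithm is enough" property; a scheme that accepted a match on any single digest would be only as strong as the weakest algorithm.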
omk over 2 years ago
Have always thought of creating email addresses with colliding MD5 hashes only to see how Gravatar's MD5 URL reacts to it.
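For context, a rough sketch of how a Gravatar-style avatar URL is derived from an email address, assuming the long-documented scheme of MD5 over the trimmed, lowercased address. The function name `gravatar_url` is made up for this example, and the service's current behavior should be checked against its own documentation.

```python
import hashlib


def gravatar_url(email: str) -> str:
    """Build an avatar URL from an email address (assumed MD5-based scheme)."""
    # Normalize: strip surrounding whitespace and lowercase the address.
    normalized = email.strip().lower()
    digest = hashlib.md5(normalized.encode("utf-8")).hexdigest()
    return f"https://www.gravatar.com/avatar/{digest}"


if __name__ == "__main__":
    print(gravatar_url("  Someone@Example.com "))
```

Under this scheme, two distinct addresses with the same MD5 digest would map to the same URL by construction; the hard part of the experiment is producing colliding inputs that are also valid email addresses.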
EGreg over 2 years ago
I asked a while ago whether it's feasible to get another file to generate a given hash.

The answer is no. Not even with MD5.

Just be very sure that this is the guarantee you are looking for. Often, for Merkle Trees etc. that is EXACTLY what is needed.

Can someone craft input files (e.g. images) to fool your system? Yes, but only at their own expense.

Sometimes if you want the system to be resilient even in the face of malicious inputs then yes, you should use SHA256 and higher.
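The distinction in this comment is between finding any two inputs that collide and finding a second input for a given hash. For an ideal n-bit hash the generic costs are roughly 2^(n/2) and 2^n tries respectively. The toy below makes that gap visible by truncating MD5 to 16 bits so both searches finish quickly; the truncation, the counter-based inputs, and the names `tiny_hash`, `find_collision`, and `find_second_preimage` are all assumptions for illustration and say nothing about attacks on full MD5.

```python
import hashlib

BITS = 16  # toy digest size; real MD5 has 128 bits


def tiny_hash(data: bytes) -> int:
    """First BITS bits of the MD5 digest, as an integer."""
    return int.from_bytes(hashlib.md5(data).digest()[: BITS // 8], "big")


def find_collision() -> tuple[int, int, int]:
    """Return (a, b, tries) where str(a) and str(b) share a tiny_hash."""
    seen: dict[int, int] = {}
    # With 2**BITS possible outputs, a repeat is guaranteed within 2**BITS + 1 tries.
    for i in range((1 << BITS) + 1):
        h = tiny_hash(str(i).encode())
        if h in seen:
            return seen[h], i, i + 1
        seen[h] = i
    raise RuntimeError("unreachable: pigeonhole guarantees a collision")


def find_second_preimage(target: bytes, limit: int = 1 << 24) -> tuple[int, int]:
    """Return (i, tries) where str(i) has the same tiny_hash as target."""
    goal = tiny_hash(target)
    for i in range(limit):
        if tiny_hash(str(i).encode()) == goal:
            return i, i + 1
    raise RuntimeError("no match found within the search limit")


if __name__ == "__main__":
    a, b, collision_tries = find_collision()
    i, preimage_tries = find_second_preimage(b"target file")
    print(f"collision after {collision_tries} tries: {a} and {b}")
    print(f"second preimage after {preimage_tries} tries: {i}")
```

On a typical run the collision search should finish in far fewer tries than the targeted search, which is the same asymmetry, at toy scale, that separates "someone can craft two colliding files" from "someone can forge a file matching a hash you already published".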
a-dub over 2 years ago
in the era of high fidelity generative models, i suspect that the future of media formats will be security forward with built-in protections against length extension attacks.

i'm having a hard time imagining any other future than one where people only trust signed media, and media is possibly even signed in hardware by actual physical sensors/compressors.
abetusk over 2 years ago
From the README:

"""

Colliding any pair of files has been possible for many years, but it takes several hours each time, with no shortcut. This page provide tricks specific to file formats and precomputed collision prefixes to make collision instant. git clone. Run Script. Done.

"""

Could anyone weigh in on whether these ideas can be generalized to speed up MD5 collisions in general?
rurban over 2 years ago
Ad exploitations:

I've added some inverses of hash functions here: https://github.com/rurban/smhasher/tree/master/inverse
westurner over 2 years ago
MD5 > History, Security > Collision vulnerabilities: https://en.wikipedia.org/wiki/MD5#Collision_vulnerabilities
1970-01-01 over 2 years ago
The most interesting part was collision with certificates. Platinum ticket for malware.
dekhn over 2 years ago
Side note: I recently saw an example code using tensorflow to determine the private key of some cryptosystem. I can't find it. It was literally operating on the bits of the key and somehow had a loss function. Any ideas?
retrocryptid over 2 years ago
And yet MD5 is still recommended in Applied Cryptography.