Why you should never use hash functions for message authentication

171 points by bakkdoor almost 13 years ago

19 comments

cbsmith almost 13 years ago

This is a great essay on why you should never use a hash function for message authentication. Except not for the reason the author thinks.

There are several problems here.

First, with SHA-1 for example, you have 64 bytes per chunk. That means you basically get a free ride on this problem for anything < 64 bytes. A lot of "application state" fits pretty well in 64 bytes.

Secondly, unless a message ends right on the 64 byte boundary, it is not nearly that simple. You have a bit of a problem, because the hash is padded, and when you add extra characters to your original string, that padding gets replaced with those values. So, it's no longer simple to just "keep going" from where you stopped.

Still, you can see how that leaves a distinct subset of cases where you'd be exposed. SHA-1, along with most secure hash functions, appends the length of the message to the end of the source text before performing the hash function. That means that if you add even one byte to the string, you have now changed the last 8 bytes that were fed in to the "original" hash function. Oh, and your extra byte goes in before those bytes, so not only did you change those 8 bytes, but you shifted them down a byte.

So, no, it isn't nearly that easy to crack a SHA-1 based authentication, and yet, it is easy enough that you should totally NOT use plain hashes for authentication and should use HMAC instead; they are vulnerable to extension attacks, it's just not nearly as easy as this article suggests, and conclusions one might draw from this article (like you can solve this problem by feeding all source text in to the hash algorithm backwards) are likely ill founded.

It just turns out that cryptography is way more complicated, and even in terms of understanding weaknesses that arise from doing things wrong, you are going to get it wrong. Trust the experts when they say it is a bad idea, but don't assume why it is a bad idea can be explained in a short blog article like this.

UPDATED: Added an explanation as to why it might be dangerous to just take this article at its word.
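
For reference, the padding rule the comment describes can be sketched in a few lines. This is a minimal illustration of the rule as specified for SHA-1 (a 0x80 byte, zero bytes, then the message length in bits as an 8-byte big-endian integer); the helper name `sha1_padding` is made up for this sketch and is not from the article.

```python
import struct


def sha1_padding(message_len: int) -> bytes:
    """Return the bytes SHA-1 appends to a message of message_len bytes.

    Hypothetical helper for illustration only.
    """
    # One 0x80 byte, then zeros until the total is 56 mod 64,
    # then the message length in bits as a big-endian 64-bit integer.
    zeros = (55 - message_len) % 64
    return b"\x80" + b"\x00" * zeros + struct.pack(">Q", message_len * 8)


# Show how many padding bytes and 64-byte blocks various lengths produce.
for n in (10, 55, 56, 64):
    pad = sha1_padding(n)
    print(n, len(pad), (n + len(pad)) // 64)
```

A message of up to 55 bytes fits in one 64-byte block together with its padding; at 56 bytes the padding spills into a second block.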

Ralz almost 13 years ago

I e-mailed Visa about something similar with their new upcoming V.me service. They suggest that you use MD5 to generate the tag, which is known to be weaker than SHA-1. I was a little surprised that a company like Visa would mess up on crypto and not know to use HMAC instead of just a simple hash. Never heard a response from them either.

Here's what their documentation says:

Language: Standard Syntax for Generating MD5 Hash
Java: import org.apache.commons.codec.digest.*; hash = DigestUtils.md5Hex(string1+string2+string3...);
PHP: $hash = md5($string1.$string2.$string3...);
Ruby: require 'digest/md5' hash = Digest::MD5.hexdigest(string1+string2+string3...)
Python: import md5 hash = md5.new(string1+string2+string3...)

loeg almost 13 years ago

Tl;dr: Use HMAC for Hash-based Message Authentication Codes and hash functions for hash functions. Don't use them the other way around.

PS, maybe more developers should take an intro course on crypto.
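
As a rough illustration of the "use HMAC" advice, here is a minimal sketch using Python's standard `hmac` module. The key value and the `sign`/`verify` helper names are assumptions invented for this example; nothing here is taken from the article's own code.

```python
import hashlib
import hmac

# Assumption: an illustrative key. A real key would be random and kept secret.
SECRET_KEY = b"hypothetical-server-side-key"


def sign(message: bytes) -> str:
    """Return a hex HMAC-SHA256 tag for message (hypothetical helper)."""
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def verify(message: bytes, tag: str) -> bool:
    """Recompute the tag and compare it with the supplied one."""
    expected = sign(message).encode("ascii")
    # compare_digest avoids leaking where the first mismatch occurs.
    return hmac.compare_digest(expected, tag.encode("utf-8"))


tag = sign(b"user=alice")
print(verify(b"user=alice", tag))          # tag matches this message
print(verify(b"user=alice&admin=1", tag))  # tag does not match a modified message
```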

klodolph almost 13 years ago

> Finally, you should make sure your application does not exit early if the tag is invalid. You should do all the data processing you would normally do, just short of modifying the database, and check the tag last. If you return early you risk another timing attack.

What kind of timing attack is that? In order for there to be a timing attack, there has to be a difference in the timings.

1. You can either process the data, check the authentication code, then commit.
2. Or you can check the authentication code, process the data, then commit.

I don't see any attacks on #2 that couldn't also work on #1.

Mithrandir almost 13 years ago

Number one thing I learned from the Coursera class: don't build your own crypto.

theunixbeard almost 13 years ago

The title is sort of linkbait, as in fact what it should be is "Never use hash functions vulnerable to extension attacks"... (And most common ones are) With that said, this stuff is pretty cool and after reading that the author learned all this in the Coursera Cryptography class I decided to sign up for it. (Starts June 11th)

jebblue almost 13 years ago

>> This fact means that an attacker can determine the first correct character of the tag by submitting requests to a signed URL with a different first character in the tag each time, and stopping when the request takes a little longer than usual. After guessing the first character they can move onto the second, and so on until they’ve guessed the whole correct tag.

Wouldn't this attack be eliminated by using iptables rate limiting to reduce the attack window of opportunity?
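
The character-by-character guessing described in the quoted passage can be sketched as a toy simulation. Everything here is an assumption for illustration: `SECRET_TAG` is a made-up value, and `leaky_compare` returns an exact count of matched characters as a stand-in for the response time, whereas real timing measurements are noisy and would need many samples per guess.

```python
ALPHABET = "0123456789abcdef"

# Assumption: a made-up tag that the simulated server expects.
SECRET_TAG = "c0ffee"


def leaky_compare(guess: str) -> int:
    """Simulate an early-exit comparison.

    Returns how many leading characters matched before the loop stopped,
    standing in for the time an attacker would measure.
    """
    matched = 0
    for g, s in zip(guess, SECRET_TAG):
        if g != s:
            break
        matched += 1
    return matched


def recover_tag(length: int) -> str:
    """Recover the tag one character at a time using only the leak."""
    known = ""
    for _ in range(length):
        # Pad with "_" (not in the alphabet) so later positions never match.
        best = max(
            ALPHABET,
            key=lambda c: leaky_compare((known + c).ljust(length, "_")),
        )
        known += best
    return known


print(recover_tag(len(SECRET_TAG)))
```

With a 6-character hex tag this needs at most 6 x 16 guesses instead of 16^6, which is why the position of the first mismatch is worth hiding.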

bemmu almost 13 years ago

For the string comparison, could you really use that in a timing attack? Wouldn't the difference between comparison taking one char longer be measured in nanoseconds, while the overall network lag would be milliseconds?

spicyj almost 13 years ago

> The easiest way to defeat this attack is, instead of directly comparing two strings, compare their mappings under a collision-resistant hash function.

Is this really the best way to compare strings without giving away timing info?
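
For concreteness, the approach in the quoted sentence might look like the sketch below. The function name `hashed_equal` and the choice of SHA-256 are assumptions for this example. The comparison of the two digests may still exit early, but the attacker cannot easily choose an input whose digest matches a chosen prefix, which is the point of the suggestion. Whether this is the best option is exactly what the comment asks.

```python
import hashlib


def hashed_equal(expected: bytes, supplied: bytes) -> bool:
    """Compare two values via their SHA-256 digests (hypothetical helper).

    An early mismatch now reveals something about the digest of the
    supplied value rather than about the value itself.
    """
    return hashlib.sha256(expected).digest() == hashlib.sha256(supplied).digest()


print(hashed_equal(b"expected-tag", b"expected-tag"))
print(hashed_equal(b"expected-tag", b"expected-tah"))
```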

ajdecon almost 13 years ago

Stupid question: Whenever I've used GPG to sign an email, it includes a line saying "Hash: SHA1". Does this imply PGP-signed messages are vulnerable to this, or does PGP/GPG do something different/smarter?

jiggy2011 almost 13 years ago

One thing I'm slightly confused about here. The article says:

"This sequence is then folded using a compression function h(). The details of the h() depend on the hashing function, but the only thing that concerns us here is that the compression function takes two message blocks and returns another block of the same size."

So if the chunks in a SHA-1 hash are 512 bits each then surely the output of the hash function would be 512 bits rather than the 160 bit digest?

Edit: the IV is 160 bits, so "another block of the same size" means each derivative block is the size of the IV, not of the actual data.
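
The sizes in the edit above can be made concrete with a toy fold. This is only a sketch of the general chaining structure: `toy_compress` is a stand-in built from `hashlib.sha1` and is NOT SHA-1's real compression function, and the all-zero IV is likewise an assumption. What it shows is that the chaining value stays at 160 bits while each message block is 512 bits.

```python
import hashlib
from functools import reduce

BLOCK_SIZE = 64  # 512-bit message blocks, as in SHA-1
STATE_SIZE = 20  # 160-bit chaining value, as in SHA-1


def toy_compress(state: bytes, block: bytes) -> bytes:
    """Toy stand-in for h(): takes a 20-byte state and a 64-byte block."""
    assert len(state) == STATE_SIZE and len(block) == BLOCK_SIZE
    return hashlib.sha1(state + block).digest()


def toy_fold(blocks, iv: bytes = b"\x00" * STATE_SIZE) -> bytes:
    """Fold the blocks left to right, starting from the IV."""
    return reduce(toy_compress, blocks, iv)


blocks = [bytes([i]) * BLOCK_SIZE for i in range(3)]
digest = toy_fold(blocks)
print(len(digest) * 8)  # size of the final chaining value in bits

# The fold can be resumed from any intermediate state.
print(digest == toy_compress(toy_fold(blocks[:2]), blocks[2]))
```

The second check is the property extension attacks rely on: anyone holding an intermediate state can continue folding further blocks.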

more_original almost 13 years ago

RFC 2104 specifies how you should do it, see e.g. http://de.wikipedia.org/wiki/Keyed-Hash_Message_Authentication_Code

The Handbook of Applied Cryptography, Chapter 9 (free online: http://cacr.uwaterloo.ca/hac/) nicely explains the reasons.

terangdom almost 13 years ago

In order for an extension attack to work, wouldn't the blocks have to align perfectly? Like suppose I hash [abcd][efgh][k]. How would you extend that?
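
One commonly described way around the alignment problem is for the attacker to include the original padding in the forged message, so the extension starts on a block boundary. The sketch below illustrates this with a small pure-Python SHA-1 written for this example; the key, message, and suffix are made-up values, the helper names are hypothetical, and the attacker is assumed to know (or guess) the length of the secret.

```python
import hashlib
import struct


def _rotl(x: int, n: int) -> int:
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def _compress(state, block: bytes):
    """SHA-1 compression function: 5-word state plus one 64-byte block."""
    w = list(struct.unpack(">16I", block))
    for i in range(16, 80):
        w.append(_rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))
    a, b, c, d, e = state
    for i in range(80):
        if i < 20:
            f, k = (b & c) | (~b & d), 0x5A827999
        elif i < 40:
            f, k = b ^ c ^ d, 0x6ED9EBA1
        elif i < 60:
            f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
        else:
            f, k = b ^ c ^ d, 0xCA62C1D6
        temp = (_rotl(a, 5) + f + e + k + w[i]) & 0xFFFFFFFF
        a, b, c, d, e = temp, a, _rotl(b, 30), c, d
    return tuple((s + v) & 0xFFFFFFFF for s, v in zip(state, (a, b, c, d, e)))


def padding(total_len: int) -> bytes:
    """SHA-1 padding for a message of total_len bytes."""
    zeros = (55 - total_len) % 64
    return b"\x80" + b"\x00" * zeros + struct.pack(">Q", total_len * 8)


def extend(tag_hex: str, original_len: int, suffix: bytes):
    """Forge SHA-1(original || glue || suffix) from the tag and length alone."""
    state = struct.unpack(">5I", bytes.fromhex(tag_hex))
    glue = padding(original_len)  # the padding the original hash already used
    forged_len = original_len + len(glue) + len(suffix)
    data = suffix + padding(forged_len)
    for i in range(0, len(data), 64):
        state = _compress(state, data[i:i + 64])
    return glue, struct.pack(">5I", *state).hex()


# Assumptions: illustrative key and message; the attacker sees only tag and lengths.
secret = b"hypothetical-key"
message = b"abcdefghk"
suffix = b"&admin=1"
tag = hashlib.sha1(secret + message).hexdigest()

glue, forged_tag = extend(tag, len(secret) + len(message), suffix)
real_tag = hashlib.sha1(secret + message + glue + suffix).hexdigest()
print(forged_tag == real_tag)
```

The forged message is not `message + suffix` but `message + glue + suffix`, so the attack only helps when the verifier accepts the padding bytes in the middle of the message.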

seats almost 13 years ago

tptacek or others with domain knowledge: Is the timing attack hardening suggested in the blog post a standard approach?

If I was trying to attack a system and knew loosely that they did what he suggested (hashing then comparing vs comparing with timing exposed), my untrained instinct would be that this is the weakest part. In other words I think this just makes the timing attack a little more difficult, but still possible, by producing specific hashes that carry out the timing attack.

When I've needed to harden comparisons against timing attacks, I've always just used constant time comparison functions, such as these:

http://codahale.com/a-lesson-in-timing-attacks/
http://rdist.root.org/2010/01/07/timing-independent-array-comparison/
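
A constant-time comparison of the kind mentioned above is usually written as an XOR-accumulate loop. The sketch below is a generic version, not the code from either linked post, and `constant_time_equal` is a name chosen for this example. In pure Python the timing is only approximately independent of the data; the standard library's `hmac.compare_digest` is the safer choice in practice.

```python
import hmac


def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Compare two byte strings without exiting at the first mismatch."""
    if len(a) != len(b):
        return False  # length is assumed not to be secret
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y  # accumulate differences; never branch on them
    return diff == 0


print(constant_time_equal(b"abc123", b"abc123"))
print(constant_time_equal(b"abc123", b"abc124"))
print(hmac.compare_digest(b"abc123", b"abc123"))  # stdlib equivalent
```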

quotemstr almost 13 years ago

Also, HMAC is just one MAC (message authentication code). OMAC is another good one; it has the interesting property of being built on top of a block cipher instead of a hash function, which can reduce the number of "moving parts" in a system if you're using a block cipher of some sort anyway.

ma2rten almost 13 years ago

I was wondering about that timing attack. Is that really possible? How many requests would you have to make until you can get reliable statistics over the timing of a string comparison, when you have network delays, other requests and all kinds of stuff that influence timing?

exit almost 13 years ago

what's wrong with hashing (message + secret) instead?

einhverfr almost 13 years ago

This seems to me like a variant of "do not trust the client." Good info though. I have learned a lot more about how hash algorithms work. I do wonder though if fixed-param hashes are relatively safe due to the inability to add suffixes.

X-Istence almost 13 years ago

If I remember correctly this was the same issue that Flickr had with their API calls at one point in time!