> The issue we run into here is ambiguous encoding.<p>What I have done in the past for this is to encode the messages as UTF-8 and separate them by 0xFF, since that byte value never occurs in UTF-8 encoding [0]. If the messages to be hashed are character strings, you have to decide on <i>some</i> encoding anyway in order to hash them.<p>[0] UTF-8 bytes always contain at least one zero bit: <a href="https://en.wikipedia.org/wiki/UTF-8#Encoding" rel="nofollow">https://en.wikipedia.org/wiki/UTF-8#Encoding</a>. Incidentally, if one wanted to create the UTF-8 equivalent of zero-terminated strings without reserving a character value (like NUL) as the sentinel value, one could use 0xFF for that.
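A minimal Python sketch of the 0xFF approach described above, for illustration only. The function name `hash_messages` and the choice of SHA-256 are assumptions, not something the comment specifies. It also uses 0xFF as a terminator after every message rather than a pure separator, a small variation so that an empty list and a list holding one empty string hash differently.

```python
import hashlib

# 0xFF never occurs in valid UTF-8, so it cannot collide with message bytes.
TERMINATOR = b"\xff"

def hash_messages(messages):
    """Hash a sequence of str messages, each terminated by 0xFF (hypothetical helper)."""
    h = hashlib.sha256()  # SHA-256 is an assumed choice of hash
    for msg in messages:
        # Strict UTF-8 encoding; raises UnicodeEncodeError on lone surrogates.
        h.update(msg.encode("utf-8"))
        h.update(TERMINATOR)
    return h.hexdigest()

# The same characters split at different boundaries give different digests.
print(hash_messages(["ab", "c"]) != hash_messages(["a", "bc"]))
```

Plain concatenation would give both inputs in the last line the same digest, which is the ambiguity being discussed.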
The RFC recommendation of 1G RAM or 64MB for an Argon KDF is insane. Don't follow this advice. On a real-world server, any API endpoint that follows it will quickly become a DoS vector. A saner value is 1MB for Argon. It still blocks major GPU attacks, which is the whole point.
This is a good post, but there's a little too much ritual in here about password-based KDFs for my taste. Put all the mainstream KDFs on a dartboard, yes including PBKDF2, and throw a dart. I think you'll be fine. Bcrypt, the most popular password hash, has held up surprisingly well, and scrypt might still be one of the best options.
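For readers who want to see what using one of these mainstream KDFs looks like, here is a rough sketch with PBKDF2 from Python's standard library. The helper names, the 16-byte salt, and the iteration count are all assumptions chosen for illustration; tune the count for your own hardware and latency budget.

```python
import hashlib
import hmac
import os

# Assumed iteration count for illustration; not a recommendation from the thread.
ITERATIONS = 600_000

def hash_password(password, salt=None):
    """Derive a 32-byte key from a password with PBKDF2-HMAC-SHA256 (hypothetical helper)."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def verify_password(password, salt, expected):
    """Recompute the digest and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)

salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))
```

Store the salt and iteration count alongside the digest so the parameters can be raised later without breaking existing records.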
> While Keccak doesn’t suffer from the length-extension attacks that HMAC is meant to address, the phrase “simply prepending the message with the key” carries a lot of assumptions about key length and key formatting with it.<p>A <i>lot</i> of assumptions, or just that it's fixed length?
I'm no expert and a bit tired, but: is the problem with hashing password + salt for a key just that it can be brute-forced with enough resources, or did I miss something?
At what point do I need to be concerned about this? i.e. have someone review hash-generating code sites.<p>- 100s of hashes?<p>- 1000s of hashes?<p>- 1,000,000s of hashes?
It's interesting that using JSON encoding for all messages eliminates many of the problems.<p>Ambiguous encoding? Nothing ambiguous about JSON, you don't even need any separator. Or merge them into a JSON array.<p>Length-extension attacks? Appending non-whitespace to JSON makes it invalid (for sane decoders at least).
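A small Python sketch of the JSON-array idea, covering only the ambiguous-encoding point. The function name `hash_json_array`, the compact separators, and SHA-256 are assumptions made for the example. It relies on the serializer producing identical bytes for identical input on both sides, which holds for a flat list of strings with fixed `json.dumps` options.

```python
import hashlib
import json

def hash_json_array(messages):
    """Hash a list of str messages by serializing them as one JSON array (hypothetical helper)."""
    # Fixed options so both sides produce identical bytes for identical input.
    encoded = json.dumps(list(messages), ensure_ascii=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("ascii")).hexdigest()

# JSON quoting and escaping keep the message boundaries unambiguous.
print(hash_json_array(["ab", "c"]) != hash_json_array(["a", "bc"]))
```

Quotes inside a message are escaped by the encoder, so a message cannot fake an element boundary.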
A cryptography course should be mandated in the curriculum of most universities, just so people gain some intuition about the types of attacks that are possible. Just yelling "don't roll your own crypto" isn't practical advice when most issues come from misusing primitives or combining primitives in a "weak" manner.
The article is not that different than "don't invent your own cryptography".<p>It's hard to understand for non-crypto specialists. It uses notions which are unknown to most programmers like MAC or other *MACs.<p>So not sure who is the target audience for this.