科技回声

7 comments

cornstalks, over 3 years ago

I want to say this looks interesting, but it's so sparse on details that it's hard to say. What modifications were made to base64? Is the only modification replacing "+" with "."? Is the rest of the table[1] unchanged? Can it be extended to only use URL-safe characters (like base64url)? How does this compare to other popular/optimized implementations of standard base64? What are its performance characteristics on different sizes of inputs? I've never had to decode 100 MB of base64 data, but I have had to decode a ton of 64-bit base64 strings, for example.

[1]: https://en.wikipedia.org/wiki/Base64#Base64_table

lifthrasiir, over 3 years ago

It would have been better to have an algorithm description somewhere, but it is not hard to follow. So crzy64 is essentially base64 plus two optimizations:

- Output bytes [A-Za-z0-9./] are shuffled so that they can be efficiently translated from and to the 0..63 range. The exact remapping is as follows:

    cDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz./0123456789AaBbC

- Bits are reordered and XORed with each other to make the unpacking from three 8-bit bytes to four 6-bit units extremely simple: specifically, `x ^ (x >> 6)`. The input bits `qrstuvwx ijklmnop abcdefgh` thus should be converted to the following four 6-bit units, XORed together (here represented with two-bit padding as in the actual implementation):

    00abcdef 00ijklgh 00qrmnop 00stuvwx
    0000ijkl 00000000 0000gh00 00mnop00
    000000qr 00000000 00000000 00gh0000

Since both optimizations can work equally well for SIMD and for SWAR (SIMD within a register), I guess they may even be useful for small inputs.
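For anyone who wants to try the two optimizations described above, here is a minimal Python sketch. It is not taken from the crzy64 source. It assumes the remapping string lists the output character for each 6-bit value 0..63 in order, and it assumes little-endian byte and unit order; the comment states neither. The packing is derived by inverting `x ^ (x >> 6)` directly rather than copied from the bit table, so it may differ in detail from that table and from crzy64's real layout. The names `pack24`, `unpack24`, `encode3`, and `decode4` are hypothetical.

```python
# Assumption: ALPHABET[v] is the output character for the 6-bit value v.
ALPHABET = "cDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz./0123456789AaBbC"
VALUE_OF = {ch: v for v, ch in enumerate(ALPHABET)}


def pack24(y: int) -> int:
    """Spread a 24-bit value over four 6-bit units (one per byte of a
    32-bit word) so that unpack24 recovers it."""
    i0, i1, i2 = y & 0xFF, (y >> 8) & 0xFF, (y >> 16) & 0xFF
    # Top unit: XOR of the high bits of all three input bytes.
    b3 = (i2 >> 2) ^ (i1 >> 4) ^ (i0 >> 6)
    # Each lower unit cancels what the unit above it will contribute
    # when the whole word is shifted right by 6 during unpacking.
    b2 = i2 ^ (b3 << 2)
    b1 = i1 ^ (b2 << 2)
    b0 = i0 ^ (b1 << 2)
    return (b3 << 24) | (b2 << 16) | (b1 << 8) | b0


def unpack24(x: int) -> int:
    """The unpacking step quoted in the comment, masked to 24 bits."""
    return (x ^ (x >> 6)) & 0xFFFFFF


def encode3(chunk: bytes) -> str:
    """Encode exactly 3 bytes into 4 characters (no padding handling)."""
    x = pack24(int.from_bytes(chunk, "little"))
    return "".join(ALPHABET[(x >> s) & 0x3F] for s in (0, 8, 16, 24))


def decode4(text: str) -> bytes:
    """Decode exactly 4 characters back into 3 bytes."""
    x = sum(VALUE_OF[ch] << s for ch, s in zip(text, (0, 8, 16, 24)))
    return unpack24(x).to_bytes(3, "little")
```

Passing a 3-byte chunk through `encode3` and then `decode4` returns the original bytes. Padding and inputs whose length is not a multiple of 3 are left out of this sketch.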

vortico, over 3 years ago

Isn't base64 encoding/decoding bound to memory bandwidth? If so, won't all 4/3 memory formats encode/decode at the same speed?

superjan, over 3 years ago

There has been work done on optimizing regular base64 using SSE: http://www.alfredklomp.com/programming/sse-base64/

kazinator, over 3 years ago

> There is a difference with base64 as it uses "./" instead of "+/" and the data is also pre-shuffled to speed up decoding.

You know what else uses dot slash? The Unix crypt function.

rep_movsd, over 3 years ago

Can you tell us what set of 64 characters it encodes to?

bellyfullofbac, over 3 years ago

I wonder if they just reinvented UU-(en,de)code: https://en.wikipedia.org/wiki/Uuencoding

I guess it'll go... nowhere.

Show HN: "crzy64", base64 mod aimed for fastest decoding
