Show HN: "crzy64", base64 mod aimed for fastest decoding

47 points by jpegqs, over 3 years ago

7 comments

cornstalks, over 3 years ago
I want to say this looks interesting but it's so sparse on details it's hard to say. What modifications were made to base64? Is the *only* modification replacing "+" with "."? Is the rest of the table[1] unchanged? Can it be extended to only use URL-safe characters (like base64url)? How does this compare to other popular/optimized implementations of standard base64? What are its performance characteristics on different sizes of inputs? I've never had to decode 100 MB of base64 data, but I have had to decode a ton of 64-bit base64 strings, for example.

[1]: https://en.wikipedia.org/wiki/Base64#Base64_table
lifthrasiir, over 3 years ago
It would have been better to have an algorithm description somewhere, but it is not hard to follow. So crzy64 is essentially base64 plus two optimizations:

- Output bytes [A-Za-z0-9./] are shuffled so that it can be efficiently translated from and to the 0..63 range. The exact remapping is as follows:

    cDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz./0123456789AaBbC

- Bits are reordered and *XORed with each other* to make the unpacking from three 8-bit bytes to four 6-bit units extremely simple: specifically, `x ^ (x >> 6)`. The input bits `qrstuvwx ijklmnop abcdefgh` thus should be converted to the following four 6-bit units, XORed together (here represented with two-bit padding as in the actual implementation):

    00abcdef 00ijklgh 00qrmnop 00stuvwx
    0000ijkl 00000000 0000gh00 00mnop00
    000000qr 00000000 00000000 00gh0000

Since both optimizations can equally work well for SIMD and for SWAR (SIMD within a register), I guess they may even be useful for small inputs.
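
The two steps described in this comment can be sketched in Python. This is a rough illustration, not crzy64's actual code: the function names (`mix`, `unmix`, `encode`, `decode`), the byte order, and the restriction to whole 3-byte groups are assumptions made here, so the output is not expected to match the real library. The alphabet string is assumed to list the characters for values 0 through 63 in order. The encoder is derived by inverting `x ^ (x >> 6)` rather than transcribed from the table; worked that way, the second unit picks up an extra `0000qr00` term that the table does not list, which this sketch needs for the round trip to hold.

```python
# Assumption: position i in this string is the character for 6-bit value i.
ALPHABET = "cDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz./0123456789AaBbC"
VALUE_OF = {ch: v for v, ch in enumerate(ALPHABET)}


def mix(b2: int, b1: int, b0: int) -> int:
    """Spread three bytes into four 6-bit units, one per byte of a 32-bit
    word, chosen so that unmix() recovers the original 24 bits.

    With b2 = abcdefgh, b1 = ijklmnop, b0 = qrstuvwx (hypothetical order).
    """
    u3 = ((b0 >> 2) & 0x30) | ((b1 ^ (b2 << 2)) & 0x0F)  # qr, mn^gh, op
    u4 = (b0 ^ (u3 << 2)) & 0x3F                         # stuvwx ^ low bits of u3
    u2 = ((b1 >> 2) ^ (u3 >> 2)) & 0x3F                  # ij, kl^qr, gh
    u1 = (b2 >> 2) ^ (u2 >> 2)                           # abcdef ^ top bits of u2
    return (u1 << 24) | (u2 << 16) | (u3 << 8) | u4


def unmix(x: int) -> int:
    """The decoding step from the comment: one shift and one XOR."""
    return (x ^ (x >> 6)) & 0xFFFFFF


def encode(data: bytes) -> str:
    if len(data) % 3:
        raise ValueError("this sketch only handles whole 3-byte groups")
    out = []
    for i in range(0, len(data), 3):
        x = mix(data[i], data[i + 1], data[i + 2])
        out.extend(ALPHABET[(x >> shift) & 0x3F] for shift in (24, 16, 8, 0))
    return "".join(out)


def decode(text: str) -> bytes:
    if len(text) % 4:
        raise ValueError("this sketch only handles whole 4-character groups")
    out = bytearray()
    for i in range(0, len(text), 4):
        x = 0
        for ch in text[i:i + 4]:
            x = (x << 8) | VALUE_OF[ch]  # one 6-bit unit per byte
        out += unmix(x).to_bytes(3, "big")
    return bytes(out)
```

In this sketch `encode(b"\x00\x00\x00")` gives `"cccc"`, since value 0 maps to `c`, and `decode(encode(data))` returns `data` for any input whose length is a multiple of 3. A real implementation would do the character translation arithmetically and process many groups per SIMD or SWAR operation; the dictionary lookup here is only for readability.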
vortico, over 3 years ago
Isn't base64 encoding/decoding bound to memory bandwidth? If so, won't all 4/3 memory formats encode/decode at the same speed?
superjan, over 3 years ago
There has been work done on optimizing regular base64 using SSE:

http://www.alfredklomp.com/programming/sse-base64/
kazinator, over 3 years ago
> *There is a difference with base64 as it uses "./" instead of "+/" and the data is also pre-shuffled to speed up decoding.*

You know what else uses dot slash? The Unix crypt function.
rep_movsd, over 3 years ago
Can you tell us what set of 64 characters it encodes to?
bellyfullofbac, over 3 years ago
I wonder if they just reinvented UU-(en,de)code: https://en.wikipedia.org/wiki/Uuencoding

I guess it'll go... nowhere.