The Base45 Data Encoding

93 points by b5, almost 4 years ago

11 comments

Confiks, almost 4 years ago
Note that this encoding isn't as efficient as QR binary mode, because it converts 2 bytes into 3 base45 characters. So it's more like 'base41 using the base45' charset.

I'm still a bit sad that with this standard and the packages available now, the namespace of 'base45' is clobbered by this suboptimal implementation. It would be better renamed to 'base41'. It's a good tradeoff for the DCC, but not for other possible implementations.

For the Dutch variant of the green pass using unlinkable signatures [1], we need all the space we can get, so we use a base45 encoding that uses the exact same method as base58 [2][3], and which has the exact same efficiency as QR binary mode.

[1] https://github.com/minvws/nl-covid19-coronacheck-app-coordination/blob/main/architecture/Privacy%20Preserving%20Green%20Card.md

[2] https://gist.github.com/confiks/8fcb480d87a50cf1bb5e40e2f0930fad

[3] https://github.com/confiks/base45-go/tree/main/base45
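To make the two approaches concrete, here is a minimal Python sketch. `base45_chunked` follows the chunked scheme discussed in this thread (two bytes become three characters, a trailing byte becomes two). `base45_bigint` is a hypothetical illustration of a base58-style method that treats the whole input as one large integer; it is an assumption about the general technique, not a copy of the linked implementations.

```python
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"  # QR alphanumeric set


def base45_chunked(data: bytes) -> str:
    """Chunked style: 2 bytes -> 3 chars, a trailing single byte -> 2 chars."""
    out = []
    for i in range(0, len(data), 2):
        chunk = data[i:i + 2]
        n = int.from_bytes(chunk, "big")
        width = 3 if len(chunk) == 2 else 2
        for _ in range(width):
            n, digit = divmod(n, 45)
            out.append(ALPHABET[digit])  # least significant digit first
    return "".join(out)


def base45_bigint(data: bytes) -> str:
    """Hypothetical base58-style variant: the whole input is one big integer."""
    n = int.from_bytes(data, "big")
    digits = []
    while n:
        n, digit = divmod(n, 45)
        digits.append(ALPHABET[digit])
    # As in base58, leading zero bytes are kept as leading "0" characters.
    zeros = len(data) - len(data.lstrip(b"\x00"))
    return "0" * zeros + "".join(reversed(digits))


payload = b"\xff" * 300
chunked_len = len(base45_chunked(payload))
bigint_len = len(base45_bigint(payload))
print(chunked_len, bigint_len)
```

For the two-byte input `b"AB"` the chunked encoder yields `BB8`. On longer inputs the big-integer variant needs fewer characters, at the cost of arbitrary-precision arithmetic over the whole payload.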
radicalbyte, almost 4 years ago
A nice tidbit: this RFC has its roots in the EU Covid Certificate project. The encoding was designed to cut the size of the QR payload (which for DCC is a CBOR - binary encoded - object) :)

The smaller the payload, the better and faster the scanning. Which is important for something that is designed to be used during border crossings and the like.

We have a number of implementations here:

https://github.com/ehn-dcc-development/
EdSchouten, almost 4 years ago
The idea of this encoding is to store two bytes of data in three characters. To me it's not obvious why you need a base as high as 45 for that.

Assuming you either want to store two bytes, or a trailing one, you have 256*256 + 256 combinations: 65792. Using three base45 characters, you can get up to 45^3=91125 combinations. It looks like base41 would have been sufficient. That way you can get rid of some of those special characters, making it easier to use through different transports.
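The arithmetic in this comment can be checked with a short Python sketch. The helper name `smallest_base` is made up for illustration; it simply searches for the smallest base whose three-character strings cover all the required values.

```python
def smallest_base(num_values: int, num_chars: int) -> int:
    """Smallest base b such that b ** num_chars >= num_values."""
    base = 1
    while base ** num_chars < num_values:
        base += 1
    return base


# Two full bytes, or a single trailing byte, as counted in the comment.
values_needed = 256 * 256 + 256
minimum = smallest_base(values_needed, 3)

print(values_needed)  # 65792
print(45 ** 3)        # 91125
print(minimum)        # 41
```

Base 40 falls short (40^3 = 64000), while base 41 gives 41^3 = 68921 combinations, which is enough.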
kstenerud, almost 4 years ago
Unfortunately, the QR code "binary" mode specification defaults to ISO 8859-1 for the encoding (because it was not originally intended to store actual binary data), and there's also no way to indicate what format is actually encoded. So all decoders of course just assume ISO 8859-1 because they have no way of knowing otherwise.

However, we could in theory get around this by using binary data formats that always begin with an invalid text character (such as 0x80-0x9f). This way, an implementation can know that the data is not ISO 8859-1, and try to decode whatever format it discovers through the beginning byte signature.

I've actually put this into Concise Encoding [1]

[1] https://github.com/kstenerud/concise-encoding/blob/master/cbe-specification.md#version-specifier
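A reader-side check along these lines might look like the following Python sketch. The function name and the returned labels are hypothetical, and the sketch only covers the first-byte test described above; it does not implement Concise Encoding or any real QR library API.

```python
def classify_qr_bytes(payload: bytes) -> str:
    """Guess whether byte-mode QR data is text or a binary format.

    Assumption from the comment: a binary format announces itself with a
    first byte in 0x80-0x9F, a range that holds no printable characters
    in ISO 8859-1.
    """
    if payload and 0x80 <= payload[0] <= 0x9F:
        return "binary"
    return "text"


def read_payload(payload: bytes):
    """Return decoded text, or the raw bytes when a binary signature is seen."""
    if classify_qr_bytes(payload) == "text":
        return payload.decode("iso-8859-1")
    return payload


print(classify_qr_bytes(b"HELLO"))         # text
print(classify_qr_bytes(b"\x83\x01\x02"))  # binary
```

A real decoder would go on to dispatch on the signature byte to pick a format-specific parser; that step is left out here.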
nly, almost 4 years ago
For context, it seems this encoding was covered by this discussion a few days ago:

What's Inside the EU Green Pass QR Code?

https://news.ycombinator.com/item?id=27589913
codeflo, almost 4 years ago
So instead of extending QR codes, which are inherently binary, to efficiently handle binary payloads, we invent yet another ASCII-based tunneling scheme. Why ever fix any problem when we can just pile workaround upon workaround upon workaround?
mjevans, almost 4 years ago
This could be useful for storing non-ascii armored crypto keys within a printed QR code format that can be stored in a fireproof safe and reasonably OCRed and decoded for use in disaster recovery or other applications.
chrismorgan, almost 4 years ago
So, it’s using a 45-character alphabet which matches the QR code alphanumeric values table, which lets the QR code encoder switch to a more efficient mode that takes less space.

I just tried rendering a QR code of the 692 characters of the introductory paragraph (with lines joined appropriately), and compared it with a QR code of the same text, uppercased and with out-of-range characters `,`, `[` and `]` changed to %. This reduced an 89×89 code down to 77×77, a 24% reduction in area. If this is roughly the ratio, then Base45-encoding binary data by QR code will yield roughly 17% area savings compared with Base64. (Base45 gets 50% bloat, then 24% shrink = multiple of 1.14; Base64 gets 33% bloat = multiple of 1.33; Base45 / Base64 = 1.14 / 1.33 = 0.83.)

[Edit: edflsafoiewq’s figures on 40-L codes at https://news.ycombinator.com/item?id=27627915 come to about 23% savings, markedly more than my 17%.]

I can’t help but wonder if any of the other modes could be more efficient still: numeric mode, kanji mode and byte mode.

Of course, the specs for all this are ISO specs, so I can’t read them without coughing up the moolah.

In case it’s not clear, I am utterly inexpert in this domain.

Further thoughts:

Base45 encoding two octets in three characters is pretty wasteful: 45³ ÷ 256² ≈ 1.39, which is 39% waste. (By contrast, Base64 is 100% efficient with its alphabet: 64⁴ = 256³.) This means that if you were willing to do more complex encoding and decoding, you could shrink your QR code by roughly 39% more, to about 52% of the size of the Base64, rather than 83%. Leaving such a huge gap on the table puzzles me. I’d have thought that either you’d want something simple (where Base64 is well-understood) or want to minimise your QR codes, and Base45 sits in an awkward place in the middle.

For UTF-8, base-128 will be the most efficient you get. That’ll be ~14% inflation (7 bytes in 8 characters). Which… huh, that looks to be within ε of Base45’s 50% bloat and 24% shrinkage. Not sure if that’s a coincidence or not because I don’t know how alphanumeric mode versus byte mode works in QR codes. But this suggests that alphanumeric mode and Base45-but-not-wasteful would be markedly more efficient than byte mode and Base-122. Still leaves numeric and kanji modes open as possibilities. Again, I’m inexpert and don’t know how the encodings are actually done, and that’ll matter.

On edflsafoiewq’s 40-L figures: Base64 gets 2214 bytes, Base45 gets 2864 bytes, optimally-efficient base-45 would get log₂₅₆ 45⁴²⁹⁶ = 2949 bytes, only around 3% more. I think I must have made a mistake somewhere with some of my numbers.
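The 40-L figures in the last paragraph can be reproduced with a few lines of Python. The alphanumeric capacity of 4296 characters comes from the comment's own expression; the byte-mode capacity of 2953 is an assumption added here (it is the value that yields the quoted 2214 bytes for Base64), and the variable names are illustrative only.

```python
import math

ALNUM_CHARS_40L = 4296  # alphanumeric-mode capacity, taken from the comment
BYTE_CHARS_40L = 2953   # byte-mode capacity: an assumption, see the lead-in

# Base64 in byte mode: every 4 characters carry 3 bytes.
base64_bytes = BYTE_CHARS_40L // 4 * 3

# Chunked Base45 in alphanumeric mode: every 3 characters carry 2 bytes.
base45_bytes = ALNUM_CHARS_40L // 3 * 2

# Ideal base-45: each character carries log2(45) bits, with no chunking loss.
optimal_bytes = math.floor(ALNUM_CHARS_40L * math.log2(45) / 8)

gain = optimal_bytes / base45_bytes - 1

print(base64_bytes, base45_bytes, optimal_bytes)  # 2214 2864 2949
print(f"{gain:.1%}")
```

On these numbers the ideal encoding gains only about 3% over the chunked one, which matches the final estimate in the comment rather than the earlier 39% figure.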
eventreduce1, almost 4 years ago
What are the benefits compared to base58?

Base45 uses chars like slash. This is super annoying when the encoded string is used in a URL.
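To see which characters of the 45-character QR alphanumeric set are awkward in a URL, here is a small Python sketch using the standard library's `urllib.parse.quote`. The helper name `url_safe` is made up for illustration.

```python
from urllib.parse import quote

ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"  # QR alphanumeric set

# Characters that change when percent-encoded with an empty "safe" set.
needs_escaping = [ch for ch in ALPHABET if quote(ch, safe="") != ch]


def url_safe(encoded: str) -> str:
    """Percent-encode a Base45 string for use inside a URL component."""
    return quote(encoded, safe="")


print(needs_escaping)           # [' ', '$', '%', '*', '+', '/', ':']
print(url_safe("%69 VD92EX0"))  # %2569%20VD92EX0
```

Seven of the nine punctuation characters need escaping under this strict setting; only `-` and `.` pass through unchanged.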
simojk, almost 4 years ago
How does this compare with Data Matrix and C40 encoding?
upofadown, almost 4 years ago
Why would this be an RFC? It seems quite specific to QR codes. Nothing specific to the Internet.