TE
TechEcho
Home24h TopNewestBestAskShowJobs
GitHubTwitter
Home

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

GitHubTwitter

Home

HomeNewestBestAskShowJobs

Resources

HackerNews APIOriginal HackerNewsNext.js

© 2025 TechEcho. All rights reserved.

8-bit number to binary string

158 pointsby Audiophilipover 9 years ago

13 comments

acqqover 9 years ago
The big constants used in the formula<p><pre><code> void to_bin(unsigned char c, char *out) { *(unsigned long long*)out = 3472328296227680304ULL + (((c * 9241421688590303745ULL) &#x2F; 128) &amp; 72340172838076673ULL); } </code></pre> are much more obvious when they remain in hex:<p>0x3030303030303030 -- the byte array of &#x27;0&#x27;s encoded as ASCII<p>0x8040201008040201 -- the magic<p>0x101010101010101 -- only 0 or 1 to be added to the appropriate array element<p>Therefore, more readable:<p><pre><code> void to_bin(unsigned char c, char *out) { &#x2F;* endian dependent, works on x86 *&#x2F; *(unsigned long long*)out = 0x3030303030303030ULL + (((c * 0x8040201008040201ULL) &gt;&gt; 7) &amp; 0x0101010101010101ULL); } </code></pre> The function is of course endian dependent (I&#x27;d add a comment about that fact in it) and the explanation of the quoted function on the page starts at &quot;If you shift the the constant by 7 bits to the right.&quot; The page is a result of an edit where the older function (that ends with + c &#x2F; 128;) was explained first.<p>Still, nice.
评论 #10515069 未加载
评论 #10513223 未加载
评论 #10517332 未加载
评论 #10513244 未加载
personjerryover 9 years ago
Awesome. I&#x27;m always amazed by bitmagic like this. Today I saw one on stackoverflow and spent a good deal of time failing to understand it:<p><a href="https:&#x2F;&#x2F;stackoverflow.com&#x2F;questions&#x2F;33532045&#x2F;how-to-swap-first-2-consecutive-different-bits" rel="nofollow">https:&#x2F;&#x2F;stackoverflow.com&#x2F;questions&#x2F;33532045&#x2F;how-to-swap-fir...</a><p>And of course I&#x27;m reminded of the fast inverse square root:<p><a href="https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Fast_inverse_square_root" rel="nofollow">https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Fast_inverse_square_root</a>
nialoover 9 years ago
Why isn&#x27;t this faster than the &quot;standard&quot; loop based version? The disclaimer at the start says it&#x27;s not, but I would have thought that 4 64bit math operations would be much faster than a loop with 8 steps and comparisons and so on.
评论 #10513767 未加载
评论 #10513638 未加载
rjmunroover 9 years ago
Rather than add 0x3030303030303030, you could OR with it because you know there will never be a carry. This may save cycles on some architectures.
anewhnaccount2over 9 years ago
This is an example of SIMD Within A Register. There are some more neat examples here: <a href="http:&#x2F;&#x2F;aggregate.org&#x2F;MAGIC&#x2F;" rel="nofollow">http:&#x2F;&#x2F;aggregate.org&#x2F;MAGIC&#x2F;</a>
estover 9 years ago
For Python<p><pre><code> &gt;&gt;&gt; f1=lambda c: __import__(&#x27;struct&#x27;).pack(&#x27;&lt;Q&#x27;, 3472328296227680304 + (((c * 9241421688590303745) &#x2F; 128) &amp; 72340172838076673)) &gt;&gt;&gt; f1(1) &#x27;00000001&#x27; &gt;&gt;&gt; f1(2) &#x27;00000010&#x27; &gt;&gt;&gt; f1(5) &#x27;00000101&#x27; &gt;&gt;&gt; f2=lambda c: struct.pack(&#x27;&lt;Q&#x27;, (((c * 0x8040201008040201) &#x2F; 128) &amp; 0x0101010101010101)) &gt;&gt;&gt; f2(1) &#x27;\x00\x00\x00\x00\x00\x00\x00\x01&#x27; &gt;&gt;&gt; f2(2) &#x27;\x00\x00\x00\x00\x00\x00\x01\x00&#x27; &gt;&gt;&gt; f2(101) &#x27;\x00\x01\x01\x00\x00\x01\x00\x01&#x27; </code></pre> Use &quot;&lt;&quot; for little-endian for x86
seijiover 9 years ago
Here&#x27;s a fancier, slightly less-magic, version. How to compile is left as an exercise for the reader (hint: Haswell or newer, requires BMI2 instruction set):<p><pre><code> void toBinary(uint8_t byte) { uint64_t zeroes = *(uint64_t *)&quot;00000000&quot;; uint64_t resultStr = zeroes | __builtin_bswap64(_pdep_u64(byte, 0x0101010101010101ULL)); printf(&quot;hello: %.*s\n&quot;, 8, (char *)&amp;resultStr); }</code></pre>
评论 #10513683 未加载
zamalekover 9 years ago
&gt; x86 is little endian - that&#x27;s why it&#x27;s backwards<p>This can be read the wrong way, although it is correct: the author is referring to the final byte sequence and not the input bit sequence. Endianness applies to bytes, not bits.
eruover 9 years ago
Nice little problem. Thanks for sharing!
rlonsteinover 9 years ago
Cool but I want to say I&#x27;ve seen similar, google turns up: <a href="http:&#x2F;&#x2F;www.asmcommunity.net&#x2F;forums&#x2F;topic&#x2F;?id=28498" rel="nofollow">http:&#x2F;&#x2F;www.asmcommunity.net&#x2F;forums&#x2F;topic&#x2F;?id=28498</a>
nwmcsweenover 9 years ago
I recommend reading over this: <a href="http:&#x2F;&#x2F;0x80.pl&#x2F;articles&#x2F;convert-to-bin.html" rel="nofollow">http:&#x2F;&#x2F;0x80.pl&#x2F;articles&#x2F;convert-to-bin.html</a><p>btw all this SWAR stuff should be aligned before applying.
jjnoakesover 9 years ago
Be careful with alignment. On architectures where it matters, the caller may pass in a &quot;char* out&quot; which is not suitably aligned, and doing an &quot;unsigned long long&quot; store into that address may fault.
ameliusover 9 years ago
Feature request: add a parameter that specifies the base into which the function rewrites the number.