TE
TechEcho
Home24h TopNewestBestAskShowJobs
GitHubTwitter
Home

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

GitHubTwitter

Home

HomeNewestBestAskShowJobs

Resources

HackerNews APIOriginal HackerNewsNext.js

© 2025 TechEcho. All rights reserved.

Finding the average of two unsigned integers without overflow

448 pointsby mlexover 3 years ago

31 comments

errcorrectcodeover 3 years ago
Having done computer architecture and bit twiddling x86 in the ye olden days, I immediately, independently converged on the patented solution (code &#x2F; circuit &#x2F; Verilog, more or less the same thing). It goes to show how broken the USPTO is because it&#x27;s obvious to anyone in the field. Patents are supposed to be nonobvious. (35 USC 103)<p><a href="https:&#x2F;&#x2F;patentdefenses.klarquist.com&#x2F;obviousness-sec-103&#x2F;" rel="nofollow">https:&#x2F;&#x2F;patentdefenses.klarquist.com&#x2F;obviousness-sec-103&#x2F;</a>
评论 #30254129 未加载
评论 #30256427 未加载
评论 #30255154 未加载
评论 #30256830 未加载
评论 #30254715 未加载
评论 #30255671 未加载
评论 #30256735 未加载
评论 #30258677 未加载
评论 #30254441 未加载
评论 #30254186 未加载
ridiculous_fishover 3 years ago
The &quot;SWAR&quot; approach `(a &amp; b) + (a ^ b) &#x2F; 2` looks bizarre but can be understood.<p>Adding two bits produces a sum and a carry:<p><pre><code> 0 + 0 = 0, carry 0 1 + 0 = 1, carry 0 0 + 1 = 1, carry 0 1 + 1 = 0, carry 1 </code></pre> So the sum is XOR, and the carry is bitwise AND. We can rewrite x + y as (x ^ y) + (x &amp; y)*2<p>Distribute the divide, and you get (x ^ y)&#x2F;2 + (x &amp; y) which is the mystery expression.<p>(Note this distribution is safe only because (x &amp; y)*2 is even.)
评论 #30253588 未加载
评论 #30253844 未加载
评论 #30256108 未加载
评论 #30253712 未加载
评论 #30254609 未加载
worewoodover 3 years ago
Just by reading the headline, before opening the article, I thought of the patented solution in my head. &quot;Just halve before adding, it can be off by one but some boolean logic might do it&quot;<p>Software patents are absolutely disgusting.
评论 #30254018 未加载
评论 #30254232 未加载
评论 #30254227 未加载
justin66over 3 years ago
See also: &quot;Nearly All Binary Searches and Mergesorts are Broken&quot; by Joshua Bloch. The cluefulness or otherwise with which people often react to Bloch&#x27;s excellent post is not something to ponder very closely if you want to retain any hope in the future of software engineering.<p><a href="https:&#x2F;&#x2F;ai.googleblog.com&#x2F;2006&#x2F;06&#x2F;extra-extra-read-all-about-it-nearly.html" rel="nofollow">https:&#x2F;&#x2F;ai.googleblog.com&#x2F;2006&#x2F;06&#x2F;extra-extra-read-all-about...</a><p><a href="https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=3530104" rel="nofollow">https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=3530104</a><p><a href="https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=1130463" rel="nofollow">https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=1130463</a><p><a href="https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=14906429" rel="nofollow">https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=14906429</a><p><a href="https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=6799336" rel="nofollow">https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=6799336</a><p><a href="https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=9857392" rel="nofollow">https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=9857392</a><p><a href="https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=12147703" rel="nofollow">https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=12147703</a><p><a href="https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=621557" rel="nofollow">https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=621557</a><p><a href="https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=7594625" rel="nofollow">https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=7594625</a><p><a href="https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=9113001" rel="nofollow">https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=9113001</a><p><a href="https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=16890739" rel="nofollow">https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=16890739</a><p>If doomscrolling all that isn&#x27;t enough to make you fear for mankind&#x27;s future I&#x27;m pretty sure there&#x27;s an Ulrich Drepper glibc bug report rejection related to this topic (or several) that you can google...<p>On topic: Raymond&#x27;s post has some other great stuff. SWAR!
评论 #30253300 未加载
评论 #30253987 未加载
评论 #30253842 未加载
评论 #30254930 未加载
johnhenryover 3 years ago
I saw the title and thought to just do &quot;(a &#x2F; 2) + (b &#x2F; 2)&quot; and a do a little bit of fudging if a or b is odd.<p>After reading the article, learning that<p><pre><code> unsigned average(unsigned a, unsigned b) { return (a &#x2F; 2) + (b &#x2F; 2) + (a &amp; b &amp; 1); } </code></pre> was once patented actually made me a bit sad for our entire system of patents.
评论 #30255176 未加载
nmiloover 3 years ago
<p><pre><code> There’s another algorithm that doesn’t depend on knowing which value is larger, the U.S. patent for which expired in 2016: unsigned average(unsigned a, unsigned b) { return (a &#x2F; 2) + (b &#x2F; 2) + (a &amp; b &amp; 1); } </code></pre> There&#x27;s no way that should be patentable.
评论 #30253536 未加载
评论 #30253463 未加载
评论 #30253420 未加载
评论 #30253406 未加载
评论 #30253456 未加载
评论 #30253440 未加载
评论 #30262824 未加载
评论 #30253454 未加载
评论 #30253513 未加载
评论 #30253759 未加载
rustyboltover 3 years ago
&gt; There’s another algorithm that doesn’t depend on knowing which value is larger, the U.S. patent for which expired in 2016.<p>That&#x27;s completely retarded; it&#x27;s literally the first solution I think of when I hear this problem.
评论 #30255645 未加载
评论 #30255994 未加载
dralleyover 3 years ago
&gt; I find it amusing that the PowerPC, patron saint of ridiculous instructions, has an instruction whose name almost literally proclaims its ridiculousness: rldicl. (It stands for “rotate left doubleword by immediate and clear left”.)<p>I suspect the POWER team has a good sense of humor. There&#x27;s also the EIEIO instruction <a href="https:&#x2F;&#x2F;www.ibm.com&#x2F;docs&#x2F;en&#x2F;aix&#x2F;7.2?topic=set-eieio-enforce-in-order-execution-io-instruction" rel="nofollow">https:&#x2F;&#x2F;www.ibm.com&#x2F;docs&#x2F;en&#x2F;aix&#x2F;7.2?topic=set-eieio-enforce-...</a>
benlivengoodover 3 years ago
Before reading the article: In x86 assembly, add ax, bx ; rcr ax, 1 works. I guess technically that is with overflow, but using overflow bits as intended.<p>EDIT: it&#x27;s included in the collection of methods in the article as expected.
评论 #30259269 未加载
AnotherGoodNameover 3 years ago
I noticed the following is in the middle of the article with no context that no one else is mentioning:<p><pre><code> unsigned average(unsigned a, unsigned b) { return (a &amp; b) + (a ^ b) &#x2F; 2; } </code></pre> A quick sanity check of this<p>23 &amp; 21 = 21<p>23 ^ 21 = 2<p>21 + 2 &#x2F; 2 = 22 (order of operations)<p>I wonder why this is there. It seems the best solution but no one else is mentioning it. It also has no context near it. Nor is it stated correctly. It&#x27;s just there on it&#x27;s own.
评论 #30253683 未加载
评论 #30253652 未加载
评论 #30253646 未加载
yaloginover 3 years ago
I cannot believe that solution was allowed to be patented. How crappy is our patent process? Most engineers writing code would come up with that solution first.
评论 #30255114 未加载
classichasclassover 3 years ago
He hinted at this obliquely, but the PowerPC family of bit rotate instructions (ridicl, rlwinm, rlwimi, etc.), although intimidating in the general case, allows shifting, rotation, insertion, masking and more. There are many alternative mnemonics to try to reduce the cognitive complexity but all of these just assemble to them with particular parameters.
leptoniscoolover 3 years ago
This seems fundamental, surprised elementary operations hasn&#x27;t been made a part of every major language&#x2F;framework.
评论 #30253983 未加载
Findecanorover 3 years ago
As an asm geek, I wasn&#x27;t surprised to read that taking advantage of the carry flag yielded the most efficient code for some processors. I recalled that some ISAs also have special SIMD instructions specifically for unsigned average, so I looked them up:<p>* x86 SSE&#x2F;AVX&#x2F;AVX2 have (V)PAVGB and (V)PAVGW, for 8-bit and 16-bit unsigned integers. These are &quot;rounding&quot; instruction though: adding 1 to the sum before the shift.<p>* ARM &quot;Neon&quot; has signed and unsigned &quot;Halving Addition&quot;. 8,16 or 32 bit integers. Rounding or truncating.<p>* RISC-V&#x27;s new Vector Extension has instructions for both signed and unsigned &quot;Averaging Addition&quot;. Rounding mode and integer size are modal.<p>* The on-the-way-out MIPS MSA set has instruction for signed, unsigned, rounded and truncated average, all integer widths.<p>Some ISAs also have &quot;halving subtraction&quot;, but the purpose is not as obvious.
unwindover 3 years ago
Very cool.<p>I was surprised that the article didn&#x27;t mention the need for this in binary search, and the famous problems [1] that occured due to naive attempts.<p>[1]: <a href="https:&#x2F;&#x2F;en.m.wikipedia.org&#x2F;wiki&#x2F;Binary_search_algorithm" rel="nofollow">https:&#x2F;&#x2F;en.m.wikipedia.org&#x2F;wiki&#x2F;Binary_search_algorithm</a>
ncmncmover 3 years ago
&gt; <i>gcc doesn’t have a rotation intrinsic, so I couldn’t try it there</i><p>Gcc and Clang both recognize the pattern of shifts and OR that reproduce a rotation, and substitute the actual instruction, no intrinsic needed.<p>I bet MSVC does too.
评论 #30272284 未加载
user-the-nameover 3 years ago
&quot;unsigned long long&quot;?<p>It&#x27;s 2022. stdint.h is old enough to drink, and is probably married with a kid on the way. Just include it already?
MaxBarracloughover 3 years ago
Reminds me of a Stack Overflow thread, <i>Shortest way to calculate difference between two numbers?</i> [0] Multiple answers ignored the possibility of overflow.<p>[0] <a href="https:&#x2F;&#x2F;stackoverflow.com&#x2F;q&#x2F;10589559&#x2F;" rel="nofollow">https:&#x2F;&#x2F;stackoverflow.com&#x2F;q&#x2F;10589559&#x2F;</a>
favoritedover 3 years ago
Marshall Clow gave a pretty excellent CppCon talk covering these exact problems, called &quot;std::midpoint? How Hard Could it Be?&quot; <a href="https:&#x2F;&#x2F;www.youtube.com&#x2F;watch?v=sBtAGxBh-XI" rel="nofollow">https:&#x2F;&#x2F;www.youtube.com&#x2F;watch?v=sBtAGxBh-XI</a>
评论 #30255008 未加载
Subsentientover 3 years ago
Eh. I just cast both to a bigger integer type where possible, which in practice, is almost always. So if I&#x27;m averaging two uint32_ts, I just cast them to uint64_t beforehand. Or in Rust, with its lovely native support for 128-bit integers, I cast a 64-bit integer to 128-bit.
everyoneover 3 years ago
I find it mind boggling that something as simple as this can actually be patented.<p><pre><code> unsigned average(unsigned a, unsigned b) { return (a &#x2F; 2) + (b &#x2F; 2) + (a &amp; b &amp; 1); } </code></pre> That makes the patent system seem broken to me.
mark-rover 3 years ago
This was a lot more thorough and in-depth than I expected it to be. But that&#x27;s Raymond Chen for you.<p>One of the reasons I love Python is that integers never overflow, so this becomes a trivial problem.
评论 #30255294 未加载
评论 #30255617 未加载
Beldinover 3 years ago
Since this discussion is all about patents: my 2 cents on improving the patent system.<p>Consider a term project of an undergraduate CS course, where the goal is spelled out, but the method is left for discovery.<p>Methods developed within any such project immediately invalidate patents. They&#x27;re apparently obvious to folks learning to become &quot;skilled in the art&quot;.<p>Yes, in practice, reaching a legal threshold would be hard (are you sure the students didn&#x27;t read the patent or any description directly resulting from it?). But I&#x27;d definitely run a &quot;patent invalidation course&quot; - if I had confidence that the results would actually affect patents.
mirkodrummerover 3 years ago
Wait, what? How can a patent right be applied on a one line of code that eventually is compiled down to machine code? Sounds ridicolous to me
bufferoverflowover 3 years ago
Isn&#x27;t it better to do<p><pre><code> (a&gt;&gt;1) + (b&gt;&gt;1) + (a&amp;b&amp;1) </code></pre> No division needed.
评论 #30254021 未加载
评论 #30254047 未加载
dpacmittalover 3 years ago
You could also do (a + (b-a)&#x2F;2) where a is the smaller number.
dathinabover 3 years ago
how is turning (a+b)&#x2F;2 into a&#x2F;2 + b&#x2F;2 + a&amp;b&amp;1 even patentable?<p>Turning (a+b)&#x2F;2 into a&#x2F;2 + b&#x2F;2 is basic obvious math.<p>If you do it and to any basic testing you will realize you are getting of by one errors, locking at them can then make it obvious that when they appear and hence how to fix them.<p>Sure a proof is more complex, but then you can just trivially test it for all smaller-bit numbers over all possible inputs, hence making proofs unnecessary (for that numbers).<p>This is a solution a not yet graduated bachelor student can find in less then a day.<p>Having granted a patent for this should lead to disciplinary measurements against the person granting the patent tbh.
phs318uover 3 years ago
I got a pang of nostalgia seeing the Alpha AXP instructions.
avmichover 3 years ago
Does this all work with BCD encoding?
blobbersover 3 years ago
This guy hacks.
kingcharlesover 3 years ago
Some unreal solutions here that show how amazing mathematics can be. Especially that Google patented method that only just recently expired.<p>Props for including the assembler breakdown for every major CPU architecture.
评论 #30253319 未加载
评论 #30253371 未加载