For interest rather than practical use:<p>Creating a 2GiB lookup table is ~10% faster on my machine than method #10. That is, faster on the benchmark in the blog post. A 2GiB lookup table is horrible: it has a setup cost dominated by calloc-ing the array, and it would thrash caches in real code.<p>I also made a solution using a union to split x into two shorts and a 64KiB lookup table; it was ~15% slower. For more expensive functions, lookup tables are an annoying baseline to beat. (Although still often best avoided because of the setup, cache problems, etc.)
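A rough sketch of the 64KiB idea (reconstructed here for illustration; names and the exact union layout aren't my original code):<p><pre><code> static unsigned char isPow16[1u << 16];   /* 64 KiB: isPow16[v] == 1 iff the 16-bit value v is a power of two */

 static void initTable(void)
 {
     /* static arrays start zeroed; just mark the powers of two below 2^16 */
     for (unsigned int v = 1; v < (1u << 16); v <<= 1)
         isPow16[v] = 1;
 }

 static int isPowerOfTwoTable(unsigned int x)
 {
     union { unsigned int u; unsigned short s[2]; } t;   /* which half is "high" depends on endianness, */
     t.u = x;                                            /* but the test below is symmetric so it doesn't matter */
     return (isPow16[t.s[0]] && t.s[1] == 0) ||
            (isPow16[t.s[1]] && t.s[0] == 0);
 }
 </code></pre>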
It seems bizarre to call the first group of methods "decimal based" when the methods don't ever look at the decimal digits of the number being tested. I would call them "arithmetic" or something like that.
The benchmarks are crap. The loop and function-call overhead dominates. They should list the time for an implementation that is just "return x;", to give a baseline number. On my machine I have to run a few thousand runs before I can distinguish between "return x;" and "return ((x != 0) && !(x & (x - 1)));"<p>They both take about 10s for 2^32 iterations.
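A rough sketch of the kind of baseline I mean (the harness and names here are mine, not the article's):<p><pre><code> #include <stdio.h>
 #include <time.h>

 static unsigned int identity(unsigned int x) { return x; }                  /* baseline */
 static unsigned int isPow2(unsigned int x) { return x && !(x & (x - 1)); }  /* method under test */

 static double seconds(unsigned int (*f)(unsigned int))
 {
     volatile unsigned int sink = 0;   /* keep the calls from being optimized away */
     clock_t t0 = clock();
     for (unsigned long long i = 0; i <= 0xFFFFFFFFULL; i++)
         sink += f((unsigned int)i);
     (void)sink;
     return (double)(clock() - t0) / CLOCKS_PER_SEC;
 }

 int main(void)
 {
     printf("return x;  %.1fs\n", seconds(identity));
     printf("bit trick: %.1fs\n", seconds(isPow2));
     return 0;
 }
 </code></pre>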
Recent Intel/AMD CPUs have a POPCNT instruction, which seems like the logical way to do this. Would be interested to see how that performs compared to these implementations.
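For reference, a minimal sketch using the GCC/Clang builtin, which compiles to POPCNT when the target supports it (e.g. with -mpopcnt); I haven't benchmarked it against the article's methods:<p><pre><code> /* a power of two has exactly one bit set */
 static int isPowerOfTwoPopcount(unsigned int x)
 {
     return __builtin_popcount(x) == 1;
 }
 </code></pre>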
A <i>nasty</i> solution, which assumes and abuses IEEE floating point format and demonstrates a couple of things.<p><pre><code> int isPowerOfTwo (unsigned int x)
 {
     int exponent;
     union { unsigned int u; float f; } tmp;

     tmp.f = x;
     exponent = (tmp.u >> 23) - 127;   /* unbias the IEEE single-precision exponent */
     return x == (1 << exponent);
 }
 </code></pre>
One can also cast to a double without losing precision, mask out the exponent and then compare to 1.0. That solution is even nastier, and needs #define's to deal with endianness.<p>Obviously the above solution is not a good idea! Amongst its several problems, casting to floats is <i>really</i> slow. Sometimes I store my integers in doubles throughout my code because it saves conversions, and can be more convenient. (Matlab users routinely store integers as doubles.)<p>What I found interesting/disconcerting was that the above function doesn't behave consistently: with 'gcc -Wall' I get isPowerOfTwo(0)==0, whereas with 'gcc -Wall -O2' I get isPowerOfTwo(0)==1. clang shows the same change in behaviour with optimization level. (For x == 0 the computed exponent is negative, and shifting by a negative amount is undefined behaviour, so the compiler is free to do either.)
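For the curious, here is a rough sketch of that double-based variant (my own illustration, not carefully tested; it assumes IEEE-754 binary64 and a 64-bit integer type, which sidesteps the endianness #define's):<p><pre><code> #include <stdint.h>
 #include <string.h>

 /* Sketch only: convert x to double (exact for 32-bit values), overwrite the
    exponent field with that of 1.0, and compare.  x is a power of two exactly
    when the remaining mantissa bits are all zero. */
 int isPowerOfTwoViaDouble(unsigned int x)
 {
     double d;
     uint64_t bits;

     if (x == 0) return 0;             /* 0.0 needs special-casing */
     d = x;
     memcpy(&bits, &d, sizeof bits);   /* type-pun without a union */
     bits &= ~(0x7FFULL << 52);        /* clear the 11 exponent bits */
     bits |= (uint64_t)1023 << 52;     /* install the exponent of 1.0 */
     memcpy(&d, &bits, sizeof d);
     return d == 1.0;
 }
 </code></pre>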
The decrement test for a power of two can be modified to count bits; it runs in time proportional to the number of set bits, so O(log n) in the worst case.<p><pre><code> int bits(unsigned n)
 {
     int i = 0;
     while (n > 0) { n &= n - 1; i++; }   /* each pass clears the lowest set bit */
     return i;
 }</code></pre>
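(n is then a power of two exactly when bits(n) == 1.)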
I haven't tried it, but I would expect a linear search in the reverse direction to be faster than binary search: on average it inspects just two values, while binary search inspects about five.
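A rough sketch of what I mean (the table and names are mine, not the article's):<p><pre><code> /* scan from the largest power of two downwards; a uniformly random 32-bit
    input usually exceeds or matches an entry within a couple of comparisons */
 static int isPowerOfTwoReverseScan(unsigned int x)
 {
     static const unsigned int powers[32] = {
         1u, 2u, 4u, 8u, 16u, 32u, 64u, 128u, 256u, 512u, 1024u, 2048u,
         4096u, 8192u, 16384u, 32768u, 65536u, 131072u, 262144u, 524288u,
         1048576u, 2097152u, 4194304u, 8388608u, 16777216u, 33554432u,
         67108864u, 134217728u, 268435456u, 536870912u, 1073741824u, 2147483648u
     };
     for (int i = 31; i >= 0; i--) {
         if (x == powers[i]) return 1;   /* found it */
         if (x > powers[i])  return 0;   /* larger than every remaining power */
     }
     return 0;                           /* x == 0 falls through */
 }
 </code></pre>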
Here's another. It's strictly inferior to the x&(x-1) one, but the idea that makes it work is so pretty it seems worth mentioning.<p><pre><code> y = x | (x>>1);
y |= y>>2;
y |= y>>4;
y |= y>>8;
y |= y>>16;
y |= y>>32; // for 64-bit numbers
return y == (x<<1)-1;
</code></pre>
So what's going on here? Well, it's easy enough to locate the <i>lowest</i> set bit in x -- it's x & -x, the same basic idea as the x&(x-1) trick -- but what the code above does is to locate the <i>highest</i> by "filling" rightwards from that bit. After the first line, the 1-bits in y are the 1-bits in x, and the bits one to the right of them. After the second, it's the 1-bits in x shifted right by 0..3 places. After the next, 0..7 places. And so on. Eventually, that comes to "all the 1-bits in x, and everything any distance to their right". So, e.g., 000101000100 -> 000111100110 -> 000111111111, which then doesn't change.<p>The nice thing is that the number of steps this takes goes like log(word width). It's in the same spirit as the 0x33333333 popcount trick.<p>(On some processors -- all x86 ones since 386, for instance -- there's an instruction that does something similar directly, but on at least some x86 processors it's rather slow.)
Is there a reason this won't work? It's the most 'readable' way I could come up with.<p><pre><code> # (Python)
 def is_power_of_two(n):
     import math
     if n <= 0:
         return False
     power = round(math.log(n, 2))
     return 2 ** power == n</code></pre>
I think it would be better to list the worst- and average-case asymptotic analyses of each method, rather than just the run-times. For instance, the run-time of the "Decimal-Based Approaches to Checking for Powers of Two" grows with the size of the number.