I like that this algorithm shows the strength of non-determinism. You simply cannot get below O(log n) bits for a counter deterministically.

It's also neat that you can trivially extend this to O(log log ... log n) bits by making the increment probability 2^(-2^(2^(...^X))), all the way up to O(log* n).

To those questioning the usefulness of this: note that it's essentially equivalent to floating-point arithmetic, where X is the exponent (and eps dictates the mantissa size). This means you can do all sorts of unbiased operations, just as you would with floating-point values. For example, you can add counters A and B (say A=10, B=8) by taking C = "A+B" = max(A,B) + {1 with prob 2^-(|A-B|+1)} = 10 + {1 with prob 2^-3}.
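Here's a minimal Python sketch of the idea, assuming the standard Morris increment rule (bump X with probability 2^-X) and the merge rule stated above; the class and function names are just illustrative:

    import random

    class MorrisCounter:
        """Stores X ~ log2(n), so the state needs only O(log log n) bits."""

        def __init__(self):
            self.x = 0  # the exponent; the only state we keep

        def increment(self):
            # Bump X with probability 2^-X so that 2^X tracks the true count.
            if random.random() < 2.0 ** -self.x:
                self.x += 1

        def estimate(self):
            # Standard unbiased Morris estimate: E[2^X] = n + 1.
            return 2 ** self.x - 1

    def merge(a, b):
        # "A+B" as described above: take max(A,B), then carry with
        # probability 2^-(|A-B|+1), e.g. A=10, B=8 -> carry prob 2^-3.
        c = MorrisCounter()
        c.x = max(a.x, b.x)
        if random.random() < 2.0 ** -(abs(a.x - b.x) + 1):
            c.x += 1
        return c

The merge mirrors floating-point addition of two numbers with exponents A and B: the smaller operand can at most propagate a carry into the larger one's exponent, and the probabilistic carry keeps the expectation in line.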
Honest question: what's the use of compressing a counter to fewer than log(n) bits? Even if you were counting something on the order of 10^30, log n is only around 100 bits. Wouldn't storing the algorithm itself take more space than you'd save on the counter?