Due to multilevel caches I doubt this would be so applicable to modern systems - the index-set representation takes several times more memory and naturally requires lots of random accesses, which are bad for caches. In some ways, it's no longer a time-space tradeoff: smaller <i>is</i> faster.<p>Also, initialising the space doesn't take so long; REP STOS on x86 is <i>fast</i> since it can write an entire cache line at once: <a href="http://ptspts.blogspot.ca/2014/10/on-speed-of-memset.html" rel="nofollow">http://ptspts.blogspot.ca/2014/10/on-speed-of-memset.html</a>
Python uses a similar trick in its internal allocater (pymalloc) [1]. Given a memory address, pymalloc determines if it controls it by computing what would be the base of the arena containing said address. It then finds the index of the arena (or trash) at a constant offset from the base. It then uses the index to check a vector of all the arenas that it controls.<p>[1]<a href="http://svn.python.org/projects/python/trunk/Misc/README.valgrind" rel="nofollow">http://svn.python.org/projects/python/trunk/Misc/README.valg...</a>
Something in between traditional bit vectors and sparse sets is described here:<p><a href="http://upcoder.com/9/fast-resettable-flag-vector/" rel="nofollow">http://upcoder.com/9/fast-resettable-flag-vector/</a><p>This is designed for a fast clear, but doesn't actually have constant time clear, it's just O(n) clear with very small (amortized) constant. It's a bit more cache friendly, though, I think, and gave us some nice speedups in practice.
I think I have seen this method used in a data structure called something like a 'B.* set' or 'B.* index', but I can't remember the exact name. Does anyone have a clue as to what I am (mis)remembering?<p>Update: I remembered that the algorithm had something to do with checking if an item is not in a set, which quickly led me to Bloom Filter - not particularly similar, as it turns out.
Reading from uninitialized memory in C seems to be a bad idea:<p><a href="http://blog.frama-c.com/index.php?post/2013/03/13/indeterminate-undefined" rel="nofollow">http://blog.frama-c.com/index.php?post/2013/03/13/indetermin...</a>