I'm a bit shocked to see this on the front page. That's possibly a comment on me as much as anything.<p>But this is just really simple bit packing. Is this something many engineers (maybe frontend/JS) don't know about now?
This is simple binary packing approach to storing small-type data in wider types. Delta compression is a natural next step from here. I once published a blog post on that topic: <a href="http://bitcharmer.blogspot.co.uk/2013/12/how-to-serialise-array-of-doubles-with.html" rel="nofollow">http://bitcharmer.blogspot.co.uk/2013/12/how-to-serialise-ar...</a><p>Mind you, the reference implementation from the blog is really poor in terms of code quality and performance because I was not allowed to open-source what I actually developed for my client, but the ideas still stand.
Delta compression can make a huge difference in some cases; because cpu cycles are cheap and bitwise operations are very fast it has the potential to bring serious benefits in some scenarios, ie. lower latency of transmitting (numeric) data over networks.
Since 3⁵ (243) is less than 256, you can actually fit 5 "trinary" numbers into a byte.<p>The code gets more complex and slow, but not really by much.
Alternatively, just store it naively and gzip.
Or use a sparse array.
Or, best, don't do anything until storage size is an actual, measured bottleneck
See also bitfields, which make the compiler generate the shifting and masking code for you:<p><a href="https://en.wikipedia.org/wiki/Bit_field" rel="nofollow">https://en.wikipedia.org/wiki/Bit_field</a>
And if you need to store values that are not powers of 2, you can use the following kind of encoding/decoding. In python3:<p><pre><code> from math import *
def number_to_trits(n):
# little-endian encoding
# (i.e. smallest precision trit first)
trits = []
v = n
for i in range(1,9):
v, d = divmod(v, 3**i)
trits.append(d)
return trits
print(number_to_trits(8))
# [2, 2, 0, 0, 0, 0, 0, 0]
def trits_to_number(trits):
n = 0
for i, trit in enumerate(trits):
n += trit*3**i
return n
print(trits_to_number([2,2]))
# 8</code></pre>
I also really liked this page on bitpacking, which also offers a technique to implement them using only math operations.<p><a href="http://number-none.com/product/Packing%20Integers/index.html" rel="nofollow">http://number-none.com/product/Packing%20Integers/index.html</a><p>I recently implemented a bitpacker in a game VM that didn't have bitwise operations, and the math-based version worked perfectly.
Isn't it possible to use SQLite or something for something like this? That would greatly simplify most of these tasks, making your dataset reachable through a fast declarative language.<p>As many games of Go as 100000 is, it's not so big you couldn't use a regular, untuned database.
This method is only optimal if your numbers have ranges that fit nicely into powers of two. See here for optimal bit packing: <a href="https://codeplea.com/optimal-bit-packing" rel="nofollow">https://codeplea.com/optimal-bit-packing</a>
Neat :)<p>I wrote some very slow code to set/get the Nth bit in an array of bytes.<p>I was planning on using it for Chess bitboards - <a href="https://en.wikipedia.org/wiki/Bitboard" rel="nofollow">https://en.wikipedia.org/wiki/Bitboard</a>, but it could be used for say 8 booleans to the byte.<p>unsigned char getbit(unsigned char * bits,unsigned long n){<p><pre><code> return (bits[n/8] & (unsigned char)pow(2,n%8)) >> n%8; </code></pre>
}<p>void setbit(unsigned char * bits,unsigned long n, unsigned char val){<p><pre><code> bits[n/8] = (bits[n/8] & ~(unsigned char)pow(2,n%8)) | ((unsigned char)pow(2,n%8) * val);
}</code></pre>