Instagram made a blog post[1] (on their engineering Tumblr) about exploiting similar compression techniques.<p>Using that technique, I've personally had great success storing data in hashes by JSON-encoding it beforehand. Normally the data would be stored like so:<p><pre><code> HSET user:159784623 username ihsw
HSET user:159784623 email blahblah@gmail.com
HSET user:159784623 created_at 1377986411
</code></pre>
But instead it's like so:<p><pre><code> HSET user:156039 687 {"username":"ihsw","email":"blahblah@gmail.com","created_at":1377986411}
</code></pre>
Here we divide the data into "buckets" of size 1024; given an ID of 159784623, the resulting bucket ID is 156039 and the remainder is 687.<p><pre><code> id = 159784623
bucket_size = 1024
remainder = id % bucket_size            # 687
bucket_id = (id - remainder) / bucket_size  # 156039
</code></pre>
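To make that concrete, here is a minimal sketch of reading and writing through this bucketing scheme, assuming Python with the redis-py client (the helper names and key prefix are just illustrative, not anything from the original post):<p><pre><code> import json
import redis

r = redis.Redis()
BUCKET_SIZE = 1024

def bucket(user_id):
    # Split an ID into (bucket_id, remainder) per the scheme above
    remainder = user_id % BUCKET_SIZE
    bucket_id = (user_id - remainder) // BUCKET_SIZE
    return bucket_id, remainder

def save_user(user_id, data):
    # Store the JSON blob as one field of the bucketed hash
    bucket_id, remainder = bucket(user_id)
    r.hset("user:%d" % bucket_id, remainder, json.dumps(data))

def load_user(user_id):
    bucket_id, remainder = bucket(user_id)
    raw = r.hget("user:%d" % bucket_id, remainder)
    return json.loads(raw) if raw else None

save_user(159784623, {"username": "ihsw",
                      "email": "blahblah@gmail.com",
                      "created_at": 1377986411})
</code></pre>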
Using this I've been able to reduce memory usage anywhere from 40% to 80% (yes, 80%), depending on the compressibility of the data (the length and randomness of each hash item).<p>I've also been replacing dictionary keys with integers, which reduces the size of the stored data by a further ~30%.<p><pre><code> HSET user:156039 687 {"0":"ihsw","1":"blahblah@gmail.com","2":1377986411}
</code></pre>
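A minimal sketch of that key-shortening step, again assuming Python (the KEY_MAP table and helper names are made up for illustration):<p><pre><code> import json

KEY_MAP = {"username": "0", "email": "1", "created_at": "2"}
REVERSE_MAP = {v: k for k, v in KEY_MAP.items()}

def shrink(data):
    # Replace verbose field names with short integer keys before encoding
    return json.dumps({KEY_MAP[k]: v for k, v in data.items()})

def expand(raw):
    # Restore the original field names after decoding
    return {REVERSE_MAP[k]: v for k, v in json.loads(raw).items()}

blob = shrink({"username": "ihsw",
               "email": "blahblah@gmail.com",
               "created_at": 1377986411})
# blob == '{"0": "ihsw", "1": "blahblah@gmail.com", "2": 1377986411}'
</code></pre>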
These simple techniques shouldn't be underestimated; the gains are considerable. JSON-encoded data is still fairly verbose, so CSV may squeeze out additional savings, but JSON accommodates missing keys more gracefully.<p>Lists and sets can also accommodate "bucketing" of data, though that comes with the added complexity of supporting the variety of Redis commands those data structures bring (BLPOP, SADD, SDIFF, etc.).<p>[1] <a href="http://instagram-engineering.tumblr.com/post/12202313862/storing-hundreds-of-millions-of-simple-key-value-pairs" rel="nofollow">http://instagram-engineering.tumblr.com/post/12202313862/sto...</a>