
Save Redis memory with large list compression

33 points by glebm, over 11 years ago

3 comments

ihsw, over 11 years ago
Instagram's engineering team made a blog post[1] about exploiting similar compression techniques.

Using that technique, I've personally had great success storing data in hashes by JSON-encoding it beforehand. Normally it would look like this:

    HSET user:159784623 username ihsw
    HSET user:159784623 email blahblah@gmail.com
    HSET user:159784623 created_at 1377986411

But instead it looks like this:

    HSET user:156039 687 {"username":"ihsw","email":"blahblah@gmail.com","created_at":1377986411}

We divide the data into "buckets" of size 1024, so given an ID of 159784623, the resulting bucket ID is 156039 and the remainder is 687.

    id = 159784623
    bucket_size = 1024
    remainder = id % bucket_size
    bucket_id = (id - remainder) / bucket_size

Using this I've been able to reduce memory usage anywhere from 40% to 80% (yes, 80%), depending on the compressibility of the data (the length and randomness of each hash item).

I've also been replacing dictionary keys with integers, which further reduces the size of the stored data by an additional ~30%.

    HSET user:156039 687 {"0":"ihsw","1":"blahblah@gmail.com","2":1377986411}

It shouldn't be underestimated how much impact these simple techniques can have, especially when the gains are this considerable. JSON-encoded data can be quite verbose, so CSV may add further gains, but JSON can accommodate missing keys.

Lists and sets can also accommodate "bucketing" of data, but that comes with the added complexity of accommodating the variety of Redis commands those data structures bring (BLPOP, SADD, SDIFF, etc.).

[1] http://instagram-engineering.tumblr.com/post/12202313862/storing-hundreds-of-millions-of-simple-key-value-pairs
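A minimal sketch of the bucketed-hash scheme described in this comment, assuming the redis-py client; the "user:" key prefix, the bucket size of 1024, and the field names are just the illustrative values from the comment, not anything Redis itself prescribes:

    # Bucketed-hash storage: group many small records into one Redis hash per bucket.
    import json
    import redis

    r = redis.Redis()
    BUCKET_SIZE = 1024

    def save_user(user_id, data):
        # One JSON blob per user, stored as a field inside its bucket's hash.
        bucket_id, remainder = divmod(user_id, BUCKET_SIZE)
        r.hset(f"user:{bucket_id}", remainder, json.dumps(data))

    def load_user(user_id):
        bucket_id, remainder = divmod(user_id, BUCKET_SIZE)
        raw = r.hget(f"user:{bucket_id}", remainder)
        return json.loads(raw) if raw is not None else None

    # Example: id 159784623 lands in bucket 156039, field 687.
    save_user(159784623, {"username": "ihsw",
                          "email": "blahblah@gmail.com",
                          "created_at": 1377986411})
    print(load_user(159784623))

The memory win depends on hash-max-ziplist-entries being at least the bucket size (the usual default is lower than 1024), so each bucket hash stays in the packed encoding and the per-key overhead is paid once per bucket rather than once per user.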
neuroscr, over 11 years ago
It's not clear whether the wall-clock savings were consistent over multiple runs.

And moving one look-up out of Redis into Ruby doesn't seem like the right thing to do. It increases complexity by now requiring an application layer to process the data source.

I'd like to see how this compares with simply increasing list-max-ziplist-entries.
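For reference, the tuning neuroscr mentions is just a configuration change; a sketch assuming redis-py and a Redis version old enough to still expose the list-max-ziplist-* settings (newer releases replaced them with quicklist/listpack parameters):

    # Inspect and raise the ziplist thresholds so longer lists keep the packed
    # encoding; the values below are illustrative, not recommendations.
    import redis

    r = redis.Redis()
    print(r.config_get("list-max-ziplist-*"))       # current thresholds
    r.config_set("list-max-ziplist-entries", 1024)  # allow longer packed lists
    r.config_set("list-max-ziplist-value", 128)     # and slightly larger elements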
DrJosiah, over 11 years ago
Redis doesn't compress the contents of lists (or hashes or zsets) in a "ziplist"; it packs them in a concise format that omits structural overhead.

Source: the source code of Redis itself (https://github.com/antirez/redis/blob/unstable/src/ziplist.c), the Redis documentation (http://redis.io/topics/memory-optimization), and/or my book (http://bitly.com/redis-in-action).
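A quick way to see the packed encoding DrJosiah is describing, assuming redis-py (the encoding name reported varies by Redis version):

    # Short lists of small values are kept in a packed encoding rather than a
    # pointer-based linked list; OBJECT ENCODING reports which one is in use.
    import redis

    r = redis.Redis()
    r.delete("demo:list")
    r.rpush("demo:list", *range(10))
    print(r.object("encoding", "demo:list"))  # b'ziplist' on older Redis; 'quicklist'/'listpack' on newer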