Riffle: a high-performance write-once key/value storage engine for Clojure

36 points by prospero, over 10 years ago

2 comments

jdp, over 10 years ago
This is pretty similar to Sparkey[0] and bam[1]. Sparkey also grew out of cdb's limitations. It supports block-level compression, as Riffle does, and is optimized for accepting bulk writes. Riffle's linear-time merge behavior, borrowed from Sorted String Tables, is a nice alternative to accepting writes at runtime. bam is cool in that it takes a plain separated-values file as input and builds an index file from a minimal perfect hash function over that input file.

[0]: https://github.com/spotify/sparkey
[1]: https://github.com/StefanKarpinski/bam
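To make the linear-time merge idea concrete, here is a minimal Python sketch of the general Sorted String Table technique: two key-sorted streams are combined in a single pass. This is not Riffle's code. The name `merge_sorted`, the in-memory `(key, value)` tuples, and the rule that the newer entry wins on a duplicate key are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, Tuple

Entry = Tuple[bytes, bytes]  # hypothetical (key, value) record


def merge_sorted(older: Iterable[Entry], newer: Iterable[Entry]) -> Iterator[Entry]:
    """Merge two key-sorted streams in one pass.

    Assumption: when both streams contain the same key, the entry
    from `newer` is kept and the one from `older` is dropped.
    """
    done = object()  # sentinel marking an exhausted stream
    a, b = iter(older), iter(newer)
    x, y = next(a, done), next(b, done)

    # Each step advances at least one stream, so the total work is
    # proportional to len(older) + len(newer).
    while x is not done and y is not done:
        if x[0] < y[0]:
            yield x
            x = next(a, done)
        elif y[0] < x[0]:
            yield y
            y = next(b, done)
        else:
            yield y  # duplicate key: newer wins
            x, y = next(a, done), next(b, done)

    # Drain whichever stream still has entries.
    while x is not done:
        yield x
        x = next(a, done)
    while y is not done:
        yield y
        y = next(b, done)
```

Because both inputs are already sorted, nothing has to be buffered or re-sorted, which is why a store can afford to rebuild a whole file instead of accepting individual writes at runtime.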
fiatmoney, over 10 years ago
"While memory-mapping is used for the hashtable, values are read directly from disk, decoupling our I/O throughput from how much memory is available."

Whether you're mmap'ing or using read(), you're hitting the page cache before you hit disc, and potentially evicting the LRU page thereof. Glancing through the source it doesn't look like they're using actual "direct IO" (which, in order to be performant, would have to have its own caching layer).

That being the case, for lots of tiny reads & writes I'd expect mmap to be superior to read() and write().
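As a rough illustration of the two access paths being compared, the Python sketch below reads the same byte range once with a positioned read and once through a memory mapping. Both go through the OS page cache, since neither opens the file with direct I/O. The function names are hypothetical, `os.pread` is assumed to be available (it is Unix-only), and this says nothing about how Riffle itself reads values.

```python
import mmap
import os


def read_with_pread(path: str, offset: int, length: int) -> bytes:
    """Read `length` bytes at `offset` with a positioned read() call.

    The file is opened without O_DIRECT, so the kernel serves the
    request through the page cache.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, length, offset)
    finally:
        os.close(fd)


def read_with_mmap(path: str, offset: int, length: int) -> bytes:
    """Read the same range by slicing a read-only memory mapping.

    Pages are faulted in from the same page cache; there is no
    per-read system call once the mapping exists.
    """
    with open(path, "rb") as f:
        # Length 0 maps the whole file (the file must be non-empty).
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[offset:offset + length]
```

In a real store the mapping would be created once and reused. It is reopened per call here only to keep the two functions symmetric, so this sketch is not suitable for benchmarking.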