TE
TechEcho
Home24h TopNewestBestAskShowJobs
GitHubTwitter
Home

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

GitHubTwitter

Home

HomeNewestBestAskShowJobs

Resources

HackerNews APIOriginal HackerNewsNext.js

© 2025 TechEcho. All rights reserved.

Introduction to LSM Trees: May the Logs Be with You

128 pointsby priyankvexabout 6 years ago

3 comments

DocSavageabout 6 years ago
I found this older introduction to be pretty good and part of a series: <a href="https:&#x2F;&#x2F;medium.com&#x2F;databasss&#x2F;on-disk-io-part-3-lsm-trees-8b2da218496f" rel="nofollow">https:&#x2F;&#x2F;medium.com&#x2F;databasss&#x2F;on-disk-io-part-3-lsm-trees-8b2...</a><p>Besides the use of LSM trees in RocksDB and leveldb-like databases, there is also the WiscKey approach (<a href="https:&#x2F;&#x2F;www.usenix.org&#x2F;node&#x2F;194425" rel="nofollow">https:&#x2F;&#x2F;www.usenix.org&#x2F;node&#x2F;194425</a>) that helps read&#x2F;write amplification by keeping the LSM tree small and mostly removing the values to a separate value log. There&#x27;s a pure Go implementation of the WiscKey approach used by dgraph: <a href="https:&#x2F;&#x2F;github.com&#x2F;dgraph-io&#x2F;badger" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;dgraph-io&#x2F;badger</a>
评论 #19771723 未加载
评论 #19771931 未加载
lichtenbergerabout 6 years ago
If you just need to fetch values by a key, for the main storage (might be even as simple as generated by a sequence generator) you can even avoid the asynchronous background compaction overhead and thus unpredictable read- or write-peaks and so on by hashing the keys if it&#x27;s not already an integer&#x2F;long based identifier: Basically storing a persistent (both on-disk persistence as well as in the functional sense immutable) hash array based trie. This can easily be extended to store a new revision through copy-on-write semantics. Instead of storing whole page snapshots however, storage advanced now permit fine granular access to your data. Thus you can basically apply lessons learned from backup systems to version the data pages itself and even improve on that.<p>Disclaimer: I&#x27;m the author of a recent article about a free open source storage system I&#x27;m maintaining, which versions data at it&#x27;s very core: &quot;Why Copy-on-Write Semantics and Node-Level-Versioning are key to Efficient Snapshots&quot;: <a href="https:&#x2F;&#x2F;hackernoon.com&#x2F;sirix-io-why-copy-on-write-semantics-and-node-level-versioning-are-key-to-efficient-snapshots-754ba834d3bb" rel="nofollow">https:&#x2F;&#x2F;hackernoon.com&#x2F;sirix-io-why-copy-on-write-semantics-...</a>
评论 #19771231 未加载
评论 #19771262 未加载
评论 #19771174 未加载
webshit155about 6 years ago
I don&#x27;t understand the hash index part. I guess that for every segment on disk, you also have a hash table for it, correct? Also, this part doesn&#x27;t seem to be very memory-efficient