
Graviton Database: ZFS for key-value stores

86 points · by autopoiesis · over 4 years ago

12 comments

aftbit · over 4 years ago
> Graviton is currently alpha software.

More like the "BTRFS for key-value stores" ;)

Kidding aside, I dislike when new unproven software claims the name of industry standards like this. When I saw the headline, I was hoping this somehow actually leveraged ZFS's storage layer, but actually it is just a new database that thinks Copy-on-Write is cool.
ysleepy · over 4 years ago
Nice!

I implemented pretty much the same trade-off set in an authenticated storage system: single writer, radix Merkle tree, persistent storage, hashed keys, proofs. I guess it is a local maximum within that trade-off space.

I like how the time-travelling/history is always touted as a feature (which it is), but it really just means the garbage-collection/pruning part of the transaction engine is missing. Postgres and other MVCC systems could all be doing this, but they don't. The hard part of the feature is being able to turn it off.

I'll probably have a look around later. The diffing looks interesting; not sure yet whether it's done using the Merkle tree (likely) or some commit-walking algorithm.
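A minimal Go sketch (not Graviton's actual code) of the Merkle-tree-based diff speculated about above. The node layout, field names, and Diff signature are assumptions for illustration; the key idea is that any subtree whose hash is identical in both versions is pruned rather than walked.

```go
package merklediff

import "bytes"

// Node is a hypothetical radix-Merkle-tree node: a hash covering its whole
// subtree, a key/value pair on leaves, and child slots on inner nodes.
type Node struct {
	Hash     []byte
	Key, Val []byte // populated on leaves only
	Children []*Node
}

func isLeaf(n *Node) bool { return n != nil && len(n.Children) == 0 }

func children(n *Node) []*Node {
	if n == nil {
		return nil
	}
	return n.Children
}

// Diff walks two versions of the tree and reports keys whose values differ.
// Subtrees with equal root hashes are skipped entirely, so the cost scales
// with the number of changed keys rather than with the size of the tree.
func Diff(prev, curr *Node, changed func(key, oldVal, newVal []byte)) {
	if prev == nil && curr == nil {
		return
	}
	if prev != nil && curr != nil && bytes.Equal(prev.Hash, curr.Hash) {
		return // identical subtree: prune
	}
	if isLeaf(prev) || isLeaf(curr) {
		var k, ov, nv []byte
		if prev != nil {
			k, ov = prev.Key, prev.Val
		}
		if curr != nil {
			k, nv = curr.Key, curr.Val
		}
		changed(k, ov, nv) // added, removed, or modified leaf
		return
	}
	// Recurse slot by slot; a radix layout keeps child positions aligned.
	oc, cc := children(prev), children(curr)
	n := len(oc)
	if len(cc) > n {
		n = len(cc)
	}
	for i := 0; i < n; i++ {
		var p, c *Node
		if i < len(oc) {
			p = oc[i]
		}
		if i < len(cc) {
			c = cc[i]
		}
		Diff(p, c, changed)
	}
}
```

A commit-walking diff would instead replay every write between the two versions, which costs time proportional to the length of the history rather than to the number of changed keys.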
derefr · over 4 years ago
Does anyone know of an embedded key-value store that *does* do versioning/snapshots, but *doesn't* bother with cryptographic integrity (and so gets better OLAP performance than a Merkle-tree-based implementation)?

My use-case is a system that serves as an OLAP data warehouse of representations of how another system's state looked at various points in history. You'd open a handle against the store, passing in a snapshot version, and then do OLAP queries against that snapshot.

Things that make this a hard problem: the dataset is too large to just store the versions as independent copies, so it really needs *some* level of data-sharing between the snapshots. But it also needs to be fast for reads, especially whole-bucket reads; it's an *OLAP* data warehouse. Merkle-tree-based designs really suck for doing indexed table scans.

But, things that can be traded off: there'd only need to be one (trusted) writer, who would just be batch-inserting new snapshots generated by reducing over a CQRS/ES event stream. It'd be that (out-of-band) event stream that'd be the canonical, integrity-verified, etc. representation for all this data. These CQRS state-aggregate snapshots would just be a cache. If the whole thing got corrupted, I could just throw it all away and regenerate it from the CQRS/ES event stream; or, hopefully, "rewind" the database back to the last-known-good commit (i.e. purge all snapshots above that one) and then regenerate only the rest from the event stream.

I'm not personally aware of anything that targets exactly this use case. I'm working on something for it myself right now.

Two avenues I'm looking into:

• something that acts like a hybrid between LMDB and btrfs (i.e. a B-tree with copy-on-write ref-counted pages shared between snapshots, where those snapshots appear as B-tree nodes themselves)

• "keyframe" snapshots as regular independent B-trees, maybe relying on L2ARC-like block-level dedup between them; "interstitial" snapshots as on-disk HAMT 'overlays' of the last keyframe B-tree, that share nodes with other on-disk HAMTs, but only within their "generation" (i.e. up to the next keyframe), such that they can all be rewritten/compacted/finalized once the next keyframe arrives, or maybe even converted into "B-frames" that have forward-references to data embedded in the next keyframe.
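A minimal Go sketch of the first avenue above (copy-on-write, ref-counted pages shared between snapshots). It is an illustration under assumed names, not an existing library, and uses a one-key-per-page binary tree where a real store would use wide on-disk B-tree pages:

```go
package cow

import "bytes"

// Page is a simplified copy-on-write tree page with a reference count. Real
// B-tree pages hold many keys; one key per page keeps the sharing visible.
type Page struct {
	Refs        int
	Key, Val    []byte
	Left, Right *Page
}

// Snapshot is just a root pointer; every version of the tree is one of these.
type Snapshot struct{ Root *Page }

func retain(p *Page) *Page {
	if p != nil {
		p.Refs++
	}
	return p
}

// Release drops a snapshot's claim on a subtree; a page is reclaimable only
// once no snapshot references it any more.
func Release(p *Page) {
	if p == nil {
		return
	}
	if p.Refs--; p.Refs == 0 {
		Release(p.Left)
		Release(p.Right)
	}
}

// Insert returns the root of a new version. Only pages on the path from the
// root down to the changed key are copied; every other page is shared between
// the old and new snapshots via its reference count.
func Insert(p *Page, key, val []byte) *Page {
	if p == nil {
		return &Page{Refs: 1, Key: key, Val: val}
	}
	cp := &Page{Refs: 1, Key: p.Key, Val: p.Val}
	switch c := bytes.Compare(key, p.Key); {
	case c < 0:
		cp.Left = Insert(p.Left, key, val)
		cp.Right = retain(p.Right)
	case c > 0:
		cp.Left = retain(p.Left)
		cp.Right = Insert(p.Right, key, val)
	default:
		cp.Val = val
		cp.Left = retain(p.Left)
		cp.Right = retain(p.Right)
	}
	return cp
}
```

Dropping a snapshot is just Release on its root; pages still referenced by other snapshots survive, so storage grows only with the data that actually changed between versions.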
moralestapia · over 4 years ago
I love the idea, but I think you (the author) need a lot of time/support to polish this. You probably need a team.

Also:

> Superfast proof generation time of around 1000 proofs per second per core.

Does this limit in *any* way things like read/write performance or usability in general?
Rochus · over 4 years ago
What is the use case? Why is it important that "All keys, values are backed by blake 256 bit checksum"?
bdcravens · over 4 years ago
You can run a Graviton database. You can also run a database on a Graviton:

https://aws.amazon.com/about-aws/whats-new/2020/07/announcing-preview-for-amazon-rds-m6g-and-r6g-instance-types/

For best results, run Graviton on a Graviton:

https://aws.amazon.com/ec2/graviton/
TomTinks · over 4 years ago
This is definitely something to look into. So far DERO looks like a pretty solid project with out-of-the-box thinking.
byteshock · over 4 years ago
If latency and performance are a concern, there are also solutions like RocksDB or LevelDB.
ramoz · over 4 years ago
Comparison to Badger? Badger is also Go-native and, for me, has been exceptional at scale and for read-heavy workloads on SSD.

Ref: https://github.com/dgraph-io/badger
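For context, Badger's embedded API looks roughly like this; a minimal sketch from memory, so the import path and option signatures may differ between Badger versions (check the dgraph-io/badger README for the current API):

```go
package main

import (
	"log"

	badger "github.com/dgraph-io/badger"
)

func main() {
	// Badger keeps both keys and values on disk in the given directory.
	db, err := badger.Open(badger.DefaultOptions("/tmp/badger"))
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Writes happen inside a read-write transaction.
	if err := db.Update(func(txn *badger.Txn) error {
		return txn.Set([]byte("answer"), []byte("42"))
	}); err != nil {
		log.Fatal(err)
	}

	// Reads happen inside a read-only transaction.
	if err := db.View(func(txn *badger.Txn) error {
		item, err := txn.Get([]byte("answer"))
		if err != nil {
			return err
		}
		return item.Value(func(val []byte) error {
			log.Printf("answer = %s", val)
			return nil
		})
	}); err != nil {
		log.Fatal(err)
	}
}
```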
BryanG2 · over 4 years ago
Could someone paste timing results for diffing very large data sets?
nickcw · over 4 years ago
What I'd really like is a multiprocess-safe embeddable database written in pure Go. So, a database which is safe to read and write from separate processes.

Unfortunately, I don't think this one is multiprocess-safe.
AtlasBarfed · over 4 years ago
...doesn't Cassandra do a lot of this?