>Graviton is currently alpha software.<p>More like the "BTRFS for key-value stores" ;)<p>Kidding aside, I dislike it when new, unproven software claims the name of industry standards like this. When I saw the headline, I was hoping this somehow actually leveraged ZFS's storage layer, but it is really just a new database that thinks Copy-on-Write is cool.
Nice!<p>I implemented pretty much the same trade-off set in an authenticated storage system.<p>Single writer, radix merkle tree, persistent storage, hashed keys, proofs.<p>I guess it is a local maximum within that trade-off space.<p>I like how the time travelling/history is always touted as a feature (which it is), but it really just means the garbage collector/pruning part of the transaction engine is missing. Postgres and other MVCC systems could all be doing this, but they don't. The hard part of the feature is being able to turn it off.<p>I'll probably have a look around later; the diffing looks interesting. Not sure yet if it's done using the merkle tree (likely) or some commit-walking algorithm.
Does anyone know of an embedded key-value store that <i>does</i> do versioning/snapshots, but <i>doesn’t</i> bother with cryptographic integrity (and so gets better OLAP performance than a Merkle-tree-based implementation)?<p>My use case is a system that serves as an OLAP data warehouse of representations of how another system’s state looked at various points in history. You’d open a handle against the store, passing in a snapshot version; and then do OLAP queries against that snapshot.<p>Things that make this a hard problem: The dataset is too large to just store the versions as independent copies; so it really needs <i>some</i> level of data-sharing between the snapshots. But it also needs to be fast for reads, especially whole-bucket reads—it’s an <i>OLAP</i> data warehouse. Merkle-tree-based designs really suck for doing indexed table scans.<p>But, things that can be traded off: there’d only need to be one (trusted) writer, who would just be batch-inserting new snapshots generated by reducing over a CQRS/ES event stream. It’d be that (out-of-band) event stream that’d be the canonical, integrity-verified, etc. representation for all this data. These CQRS state-aggregate snapshots would just be a cache. If the whole thing got corrupted, I could just throw it all away and regenerate it from the CQRS/ES event stream; or, hopefully, “rewind” the database back to the last-known-good commit (i.e. purge all snapshots above that one) and then regenerate only the rest from the event stream.<p>I’m not personally aware of anything that targets exactly this use case. I’m working on something for it myself right now.<p>Two avenues I’m looking into:<p>• something that acts like a hybrid between LMDB and btrfs (i.e. a B-tree with copy-on-write ref-counted pages shared between snapshots, where those snapshots appear as B-tree nodes themselves)<p>• “keyframe” snapshots as regular independent B-trees, maybe relying on L2ARC-like block-level dedup between them; “interstitial” snapshots as on-disk HAMT ‘overlays’ of the last keyframe B-tree, that share nodes with other on-disk HAMTs, but only within their “generation” (i.e. up to the next keyframe), such that they can all be rewritten/compacted/finalized once the next keyframe arrives, or maybe even converted into “B-frames” that have forward-references to data embedded in the next keyframe.
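The first avenue (copy-on-write ref-counted pages shared between snapshots) can be sketched in a few lines of Go. This toy uses a binary search tree rather than a real B-tree, and the `refs` bookkeeping is illustrative only (no free/prune logic) — it just shows the path-copying mechanic that lets every snapshot stay readable while sharing untouched subtrees:

```go
package main

import "fmt"

// node is a toy copy-on-write tree node. Snapshots are just root
// pointers; refs counts how many parents/roots reference a node,
// which a real engine would use to decide when a page is freeable.
type node struct {
	key, val    string
	left, right *node
	refs        int
}

// put returns a NEW root that shares all untouched subtrees with the
// old one: only the path from root to the modified key is copied.
func put(n *node, key, val string) *node {
	if n == nil {
		return &node{key: key, val: val, refs: 1}
	}
	cp := *n // copy-on-write: duplicate just this node
	cp.refs = 1
	switch {
	case key < n.key:
		cp.left = put(n.left, key, val)
		if cp.right != nil {
			cp.right.refs++ // shared subtree gains a reference
		}
	case key > n.key:
		cp.right = put(n.right, key, val)
		if cp.left != nil {
			cp.left.refs++
		}
	default:
		cp.val = val
		if cp.left != nil {
			cp.left.refs++
		}
		if cp.right != nil {
			cp.right.refs++
		}
	}
	return &cp
}

func get(n *node, key string) (string, bool) {
	for n != nil {
		switch {
		case key < n.key:
			n = n.left
		case key > n.key:
			n = n.right
		default:
			return n.val, true
		}
	}
	return "", false
}

func main() {
	var v1 *node
	for _, kv := range [][2]string{{"b", "2"}, {"a", "1"}, {"c", "3"}} {
		v1 = put(v1, kv[0], kv[1])
	}
	v2 := put(v1, "b", "20") // snapshot v1 stays fully readable
	o, _ := get(v1, "b")
	n, _ := get(v2, "b")
	fmt.Println(o, n) // prints: 2 20 — versions coexist, sharing "a" and "c"
}
```

With real B-tree pages instead of single nodes, each write copies O(depth) pages, and "purge all snapshots above a commit" becomes: drop those roots and free any page whose refcount hits zero.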
I love the idea, but I think you (the author) need a lot of time/support to polish this. You probably need a team.<p>Also,<p>>Superfast proof generation time of around 1000 proofs per second per core.<p>Does this limit in <i>any</i> way things like read/write performance or usability in general?
You can run a Graviton database. You can also run a database on a Graviton:<p><a href="https://aws.amazon.com/about-aws/whats-new/2020/07/announcing-preview-for-amazon-rds-m6g-and-r6g-instance-types/" rel="nofollow">https://aws.amazon.com/about-aws/whats-new/2020/07/announcin...</a><p>For best results, run Graviton on a Graviton:<p><a href="https://aws.amazon.com/ec2/graviton/" rel="nofollow">https://aws.amazon.com/ec2/graviton/</a>
How does this compare to Badger? Badger is also Go-native and, for me, has been exceptional at scale and for read-heavy workloads on SSD.<p>Ref: <a href="https://github.com/dgraph-io/badger" rel="nofollow">https://github.com/dgraph-io/badger</a>
What I'd really like is a multiprocess-safe embeddable database written in pure Go — that is, a database that is safe to read and write from separate processes.<p>Unfortunately, I don't think this one is multiprocess-safe.