Does anyone know of an embedded key-value store that <i>does</i> do versioning/snapshots, but <i>doesn’t</i> bother with cryptographic integrity (and so gets better OLAP performance than a Merkle-tree-based implementation)?<p>My use-case is a system that serves as an OLAP data warehouse of representations of how another system’s state looked at various points in history. You’d open a handle against the store, passing in a snapshot version; and then do OLAP queries against that snapshot.<p>Things that make this a hard problem: The dataset is too large to just store the versions as independent copies; so it really needs <i>some</i> level of data-sharing between the snapshots. But it also needs to be fast for reads, especially whole-bucket reads—it’s an <i>OLAP</i> data warehouse. Merkle-tree-based designs really suck for doing indexed table scans.<p>But, things that can be traded off: there’d only need to be one (trusted) writer, who would just be batch-inserting new snapshots generated by reducing over a CQRS/ES event stream. It’d be that (out-of-band) event stream that’d be the canonical, integrity-verified, etc. representation for all this data. These CQRS state-aggregate snapshots would just be a cache. If the whole thing got corrupted, I could just throw it all away and regenerate it from the CQRS/ES event stream; or, hopefully, “rewind” the database back to the last-known-good commit (i.e. purge all snapshots above that one) and then regenerate only the rest from the event stream.<p>I’m not personally aware of anything that targets exactly this use case. I’m working on something for it myself right now.<p>Two avenues I’m looking into:<p>• something that acts like a hybrid between LMDB and btrfs (i.e. 
a B-tree with copy-on-write, ref-counted pages shared between snapshots, where the snapshots themselves appear as B-tree nodes)<p>• “keyframe” snapshots stored as regular, independent B-trees, maybe relying on ZFS-style block-level dedup between them; “interstitial” snapshots stored as on-disk HAMT ‘overlays’ of the last keyframe B-tree. The overlays would share nodes with other on-disk HAMTs, but only within their “generation” (i.e. up to the next keyframe), so that they can all be rewritten/compacted/finalized once the next keyframe arrives, or maybe even converted into “B-frames” that have forward-references to data embedded in the next keyframe.
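<p>To make the keyframe/overlay idea a bit more concrete, here is a rough in-memory sketch in Python. Everything in it is hypothetical: the <code>SnapshotStore</code> name, the <code>keyframe_interval</code> parameter, and the method names are made up for illustration. It uses plain dicts where the real thing would use on-disk B-trees and HAMTs, and each overlay is an independent delta against its keyframe, so it does not model node sharing between overlays, compaction, or “B-frames” at all. It only shows the read path (keyframe plus overlay) and the “rewind to last-known-good” operation.

```python
from typing import Dict, Iterator, Tuple

_TOMBSTONE = object()  # marks a key as deleted relative to the keyframe


class SnapshotStore:
    """Toy model: keyframes hold full state, interstitial versions hold deltas."""

    def __init__(self, keyframe_interval: int = 4) -> None:
        self.keyframe_interval = keyframe_interval  # hypothetical tuning knob
        self.keyframes: Dict[int, Dict[str, str]] = {}     # version -> full state
        self.overlays: Dict[int, Dict[str, object]] = {}   # version -> delta
        self.latest = -1

    def _layers(self, version: int):
        """Return (keyframe state, overlay delta) for a committed version."""
        if not 0 <= version <= self.latest:
            raise KeyError(f"no snapshot {version}")
        base = self.keyframes[version - version % self.keyframe_interval]
        return base, self.overlays.get(version, {})

    def commit(self, state: Dict[str, str]) -> int:
        """Single trusted writer: store the next snapshot, return its version."""
        version = self.latest + 1
        if version % self.keyframe_interval == 0:
            self.keyframes[version] = dict(state)  # full independent copy
        else:
            base = self.keyframes[version - version % self.keyframe_interval]
            # Record only keys that changed or were added since the keyframe...
            delta: Dict[str, object] = {
                k: v for k, v in state.items() if base.get(k) != v
            }
            # ...plus tombstones for keys the keyframe has but this state lacks.
            delta.update({k: _TOMBSTONE for k in base if k not in state})
            self.overlays[version] = delta
        self.latest = version
        return version

    def get(self, version: int, key: str):
        """Point read: the overlay wins, otherwise fall through to the keyframe."""
        base, overlay = self._layers(version)
        value = overlay.get(key, base.get(key))
        return None if value is _TOMBSTONE else value

    def scan(self, version: int) -> Iterator[Tuple[str, str]]:
        """Whole-snapshot scan in key order, merging keyframe and overlay."""
        base, overlay = self._layers(version)
        for key in sorted(base.keys() | overlay.keys()):
            value = overlay.get(key, base.get(key))
            if value is not _TOMBSTONE:
                yield key, value

    def rewind(self, last_good: int) -> None:
        """Purge every snapshot newer than last_good."""
        for table in (self.keyframes, self.overlays):
            for v in [v for v in table if v > last_good]:
                del table[v]
        self.latest = min(self.latest, last_good)
```

<p>The trade-off this is meant to show: a scan of a keyframe version touches one structure, a scan of an interstitial version touches at most two, and neither has to chase hash-addressed nodes. After <code>rewind(n)</code>, the next <code>commit</code> gets version <code>n + 1</code>, so the purged snapshots can be regenerated from the event stream.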