Fossil[1] is an SCM system (like git) created by the author of SQLite (D. Richard Hipp). It uses SQLite as its database and implements versioning, branching[2], and even merging (which LiteTree doesn't do) on its own, by recording the changes to each item in a separate table.<p>This approach is more complex to implement but a lot more versatile and flexible. Most of the time you wouldn't want to version or branch the whole database, only parts of it.<p>[1] <a href="https://www.fossil-scm.org" rel="nofollow">https://www.fossil-scm.org</a><p>[2] <a href="https://www.fossil-scm.org/index.html/doc/trunk/www/branching.wiki" rel="nofollow">https://www.fossil-scm.org/index.html/doc/trunk/www/branchin...</a>
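To make the "record changes to each item in a separate table" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. This is not Fossil's actual schema: the table names `items` and `item_history` and the trigger are hypothetical, chosen only to show versioning one table rather than the whole database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (id INTEGER PRIMARY KEY, body TEXT NOT NULL);

-- Hypothetical history table: one row per superseded version of an item.
CREATE TABLE item_history (
    item_id INTEGER NOT NULL,
    version INTEGER NOT NULL,
    body    TEXT NOT NULL,
    PRIMARY KEY (item_id, version)
);

-- Whenever an item's body changes, save the old body as the next version.
CREATE TRIGGER items_track_update AFTER UPDATE OF body ON items
BEGIN
    INSERT INTO item_history (item_id, version, body)
    SELECT OLD.id, COALESCE(MAX(version), 0) + 1, OLD.body
    FROM item_history WHERE item_id = OLD.id;
END;
""")

conn.execute("INSERT INTO items (id, body) VALUES (1, 'first draft')")
conn.execute("UPDATE items SET body = 'second draft' WHERE id = 1")
conn.execute("UPDATE items SET body = 'third draft' WHERE id = 1")

# Older versions live in the history table; the live table holds the latest.
history = conn.execute(
    "SELECT version, body FROM item_history WHERE item_id = 1 ORDER BY version"
).fetchall()
current = conn.execute("SELECT body FROM items WHERE id = 1").fetchone()[0]
```

Only tables that carry such a trigger are versioned, which is the flexibility argued for above. Branching and merging would need more bookkeeping than this sketch shows.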
Thanks for posting this. My first thought was: has this been run through the official SQLite battery of tests? If so, have the tests been adapted to validate branches, rapid branch switches, branching under failure conditions (malloc failures, power outages, etc.) and concurrent access patterns?<p>One of the reasons why SQLite is so widely used is that it is carefully tested and shown to be reliable even in potentially faulty conditions. As detailed on <a href="https://sqlite.org/testing.html" rel="nofollow">https://sqlite.org/testing.html</a>, there are three test sets, one of which is public (the TCL set). I’d love to see test results to assure the safety of any data stored in LiteTree.
> LiteTree is more than TWICE AS FAST than normal SQLite on Linux and MacOSX!!!<p>In my experience, claims like these usually end up showing that the author didn't understand the `PRAGMA synchronous` setting at all, or chose to ignore it to juice their stats.<p>In this benchmark, are the data durability guarantees the same for both LiteTree and vanilla SQLite?
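For anyone who wants to check how much `PRAGMA synchronous` alone moves the numbers in vanilla SQLite, a rough sketch with Python's sqlite3 module follows. The helper name `timed_inserts`, the table `t`, and the row count are my own choices for illustration. It says nothing about LiteTree itself, and the timings depend heavily on the disk and filesystem.

```python
import os
import sqlite3
import tempfile
import time


def timed_inserts(synchronous: str, rows: int = 200):
    """Insert `rows` rows, one transaction each, under the given mode.

    Returns (numeric synchronous level reported by SQLite, elapsed seconds).
    """
    with tempfile.TemporaryDirectory() as tmp:
        # isolation_level=None puts the connection in autocommit mode,
        # so every INSERT is committed (and synced, if enabled) on its own.
        conn = sqlite3.connect(os.path.join(tmp, "bench.db"), isolation_level=None)
        try:
            conn.execute(f"PRAGMA synchronous = {synchronous}")
            level = conn.execute("PRAGMA synchronous").fetchone()[0]
            conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
            start = time.perf_counter()
            for i in range(rows):
                conn.execute("INSERT INTO t (v) VALUES (?)", (str(i),))
            elapsed = time.perf_counter() - start
        finally:
            conn.close()
    return level, elapsed


if __name__ == "__main__":
    for mode in ("OFF", "NORMAL", "FULL"):
        level, seconds = timed_inserts(mode)
        print(f"synchronous={mode} (level {level}): {seconds:.3f}s")
```

SQLite reports the setting as a number (0 = OFF, 1 = NORMAL, 2 = FULL), so printing it next to each timing makes it obvious which durability level a result was measured under.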
Neat. I'll have to compare this to my own implementation.<p><a href="https://github.com/cannadayr/git-sqlite" rel="nofollow">https://github.com/cannadayr/git-sqlite</a><p>Instead of storing the transactions as separate LMDB commits, I decided to store the database in a git repository and expose the diffs using SQLite's sqldiff utility. This leaves my workflow almost unchanged and limits the dependencies to git, sqlite, sqldiff, and bash.
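The project above relies on git and sqldiff. As a stdlib-only stand-in for the same idea (turn two database states into text and diff them), here is a small Python sketch that uses `Connection.iterdump()` and `difflib` instead of sqldiff. The table `t` and the helper names are made up for the example.

```python
import difflib
import sqlite3


def make_db(rows):
    """Build an in-memory database holding the given (id, name) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    conn.commit()
    return conn


def dump(conn):
    """Return the database as a list of SQL statements, one per line."""
    return list(conn.iterdump())


old = make_db([(1, "a")])
new = make_db([(1, "a"), (2, "b")])

# A line-based diff of the two SQL dumps, in unified diff format.
diff = list(
    difflib.unified_diff(dump(old), dump(new), "old.db", "new.db", lineterm="")
)
# Keep only the added statements, skipping the "+++ new.db" header line.
added = [line for line in diff if line.startswith("+") and not line.startswith("+++")]

if __name__ == "__main__":
    print("\n".join(diff))
```

A textual dump diff is cruder than sqldiff, which compares rows by key, but it shows why a text representation lets git-style tooling work on a binary database file.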
There has been earlier work on getting git-style branched versioning on top of databases. For relational databases, OrpheusDB (<a href="http://orpheus-db.github.io/" rel="nofollow">http://orpheus-db.github.io/</a>) puts a layer over PostgreSQL. They also supply a gRPC layer for interacting with the server.<p>For key-value stores (particularly ordered ones), there are simple techniques for adding branched versioning. We are using one for our research data service that holds 25+ TB of connectomics data, which includes 3D image and segmentation data (<a href="http://dvid.io" rel="nofollow">http://dvid.io</a>). Our paper is currently under review but should have been out several years ago :) We can use a variety of key-value storage backends and are experimenting with versioned relational DBs, so I'll definitely give LiteTree a look.
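As an illustration of one such simple technique (not necessarily what DVID does), the sketch below stores each value under a (key, version) pair and resolves reads by walking from a version up through its ancestors. The class `VersionedKV` and its methods are hypothetical, and the sketch assumes a version is no longer written to once it has children.

```python
from typing import Any, Dict, Optional, Tuple

_DELETED = object()  # tombstone: the key was deleted in this version


class VersionedKV:
    """Toy branched key-value store; unchanged data is never copied."""

    def __init__(self) -> None:
        # version id -> parent version id (the root has no parent)
        self._parent: Dict[str, Optional[str]] = {"root": None}
        # (key, version id) -> value written in exactly that version
        self._data: Dict[Tuple[str, str], Any] = {}

    def branch(self, parent: str, child: str) -> None:
        if parent not in self._parent:
            raise KeyError(f"unknown version: {parent}")
        if child in self._parent:
            raise ValueError(f"version already exists: {child}")
        self._parent[child] = parent

    def put(self, version: str, key: str, value: Any) -> None:
        if version not in self._parent:
            raise KeyError(f"unknown version: {version}")
        self._data[(key, version)] = value

    def delete(self, version: str, key: str) -> None:
        self.put(version, key, _DELETED)

    def get(self, version: str, key: str, default: Any = None) -> Any:
        # Walk towards the root; the nearest version that wrote the key wins.
        current: Optional[str] = version
        while current is not None:
            if (key, current) in self._data:
                value = self._data[(key, current)]
                return default if value is _DELETED else value
            current = self._parent[current]
        return default


kv = VersionedKV()
kv.put("root", "a", 1)
kv.branch("root", "b1")
kv.put("b1", "a", 2)      # override in one branch
kv.branch("root", "b2")
kv.delete("b2", "a")      # delete in another branch
```

In an ordered store, the (key, version) pairs for one key sit next to each other, so the ancestor walk can become a single range scan over that key's prefix.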
Is this functionality similar to PostgreSQL's deprecated "Time Travel" <a href="https://www.postgresql.org/docs/6.3/static/c0503.htm" rel="nofollow">https://www.postgresql.org/docs/6.3/static/c0503.htm</a> ?<p>AFAIK this can be a foundation for some form of Snapshot Isolation <a href="https://www.sqliteconcepts.org/SI_index.html" rel="nofollow">https://www.sqliteconcepts.org/SI_index.html</a> (?)
If merge gets supported, then it could serve as an alternative for program development: using tables to store function definitions, constants, etc. instead of using flat files.
I am looking for exactly this kind of implementation for my work project: a DB that uses a version control model.<p>However, I need a production-ready solution.<p>There is also:
<a href="https://github.com/attic-labs/noms" rel="nofollow">https://github.com/attic-labs/noms</a>
But the project does not seem mature enough.<p>Do you know if there is any way to achieve this in production? What would be the best way/stack to get this result with currently available tools?
I'm looking at this with little knowledge of how this makes the blockchain application easier. What seems odd to me is that merging branches isn't supported. So you can't perform a bunch of "transactions" and then merge them back into your master state? Maybe someone could explain the problem this solves a little more clearly, as I'm guessing it has <i>nothing</i> to do with my naive understanding.
Very interesting stuff!<p>Is it possible to see the history of a column, table, schema, etc.? Is it possible to tag a certain point in time?<p>It would be liberating for many schema designs if we could just change things and be sure that the database knew what was changed and when, with the ability to roll changes back.
Looking at the README, it's not clear how indexes are managed.
For example, what happens when we create a branch, add some data to an existing table, then move back to a previous branch and try to add data with the same index keys?
Interesting. I implemented something similar a long time ago; I'll have to see if I can dig up the source code. The goal was to support forking data without duplicating unchanged data.
Interesting. Branches could replace "date-effective" table designs. In the past I have used Git as a database to store multiple versions of a document efficiently.<p>Or this could be used as some elementary partitioning logic where each branch is effectively a partition.
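For readers who haven't met the term, a "date-effective" design typically keeps every version of a row with a validity range and filters by date on each read. Below is a minimal sketch with Python's sqlite3 module. The `price` table, its columns, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE price (
    product    TEXT NOT NULL,
    amount     INTEGER NOT NULL,  -- price in cents
    valid_from TEXT NOT NULL,     -- ISO-8601 date, inclusive
    valid_to   TEXT,              -- ISO-8601 date, exclusive; NULL = current
    PRIMARY KEY (product, valid_from)
);
INSERT INTO price VALUES
    ('widget', 500, '2018-01-01', '2018-07-01'),
    ('widget', 650, '2018-07-01', NULL);
""")


def price_on(product: str, day: str):
    """Return the amount in effect for `product` on `day`, or None."""
    row = conn.execute(
        """
        SELECT amount FROM price
        WHERE product = ?
          AND valid_from <= ?
          AND (valid_to IS NULL OR ? < valid_to)
        """,
        (product, day, day),
    ).fetchone()
    return row[0] if row else None
```

Every query has to carry the date predicate, and every update has to close the previous range. With branches, each point in time could in principle be a plain table in its own branch, which is the appeal described above.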
The use case seems to overlap with Noms - <a href="https://github.com/attic-labs/noms" rel="nofollow">https://github.com/attic-labs/noms</a><p>Noms doesn't have the appeal of SQL, but it offers versioned, forkable, strongly typed data.
This is interesting and I hope I can find a use case for it. However, the performance compared to vanilla SQLite makes me anxious that there is a trade-off elsewhere, such as crash integrity.
> LiteTree is implemented storing the SQLite db pages on LMDB.<p>Why are you doing it like that? Does it impose any limitations, such as making merges very costly?
Cool project, thanks for sharing your work. There's an older project, sqlightning, that uses LMDB purely for storage (it doesn't support branching or anything). Is LiteTree's usage of LMDB comparable to what sqlightning does? How does LiteTree work with the write-ahead log? How do multiple concurrent connections interact? Are multiple writers allowed? Can readers and writer(s) coexist?