Hello HN! I'm building mvsqlite, a distributed variant of SQLite with MVCC transactions, that runs on FoundationDB. It is a drop-in replacement that just needs an `LD_PRELOAD` for existing applications using SQLite.<p>I made this because Blueboat (<a href="https://github.com/losfair/blueboat" rel="nofollow">https://github.com/losfair/blueboat</a>) needs a native SQL interface to persistent data. Apparently, just providing a transactional key-value store isn’t enough - it is more easy and efficient to build complex business logic on an SQL database, and it seems necessary to bring a self-hostable distributed SQL DB onto the platform. Since FoundationDB is Blueboat’s only stateful external dependency, I decided to build the SQL capabilities on top of it.<p>At its core, mvsqlite’s storage engine, mvstore, is a multi-version page store built on FoundationDB. It addresses the duration and size limits (5 secs, 10 MB) of FDB transactions, by handling multi-versioning itself. Pages are fully versioned, so they are always snapshot-readable in the future. An SQLite transaction fetches the read version during `BEGIN TRANSACTION`, and this version is used as the per-page range scan upper bound in future page read requests.<p>For writes, pages are first written to a content-addressed store keyed by the page's hash. At commit, hashes of each written page in the SQLite transaction is written to the page index in a single FDB transaction to preserve atomicity. With 8K pages and ~60B per key-value entry in the page index, each SQLite transaction can be as large as 1.3 GB (compared to FDB's native txn size limit of 10 MB).<p>mvsqlite is not yet "production-ready", since it hasn’t received enough testing, and I may still have a few changes to make to the on-disk format. But please ask here if you have any questions!
I love the idea of distributed SQLite but I’m having a hard time understanding which parts of FoundationDB and which parts of SQLite are available in this implementation.<p>I’m guessing virtual table extensions work with this since you’re just replacing the storage engine? So we could in theory use FTS5 and even OSQuery and other extensions right?<p>However since this is using FoundationDB I’m also guessing we can’t use this as a serverless embedded DB since since you’ll probably need a foundation db cluster to use this. Is that right?<p>So if I understand correctly this is a SQLite query engine on top of FoundationDB with distributed transactions and we can theoretically use SQLite ecosystem stuff like FTS5 and datasette on top of it.
One thing I've been curious about with FDB (need to find time to try this myself) is using FDB as a way to easily implement replication with consistency.<p>For example: You have 5 Postgres instances. You send the query "SELECT * FROM TABLE" to FDB, you want the result of this from <i>any</i> of the 5 Postgres's (first to return wins). When you insert, you want to insert into all 5 and make sure that all 5 have actually finished the transaction before telling the client.<p>Seems simple enough to implement via FDB?
This really, IMHO (as someone implements things on top of SQLite too <a href="https://dflat.io" rel="nofollow">https://dflat.io</a>) pushes SQLite too far as the implementation of cross-db transactions have some big issues: <a href="https://www.sqlite.org/limits.html" rel="nofollow">https://www.sqlite.org/limits.html</a> (the number of attached databases cannot exceed 10 or 125 (if you compile your own)) <a href="https://www.sqlite.org/wal.html" rel="nofollow">https://www.sqlite.org/wal.html</a> (in WAL mode, there is no transactional guarantee for cross database transactions (atomic per database, but not cross database))
So the idea is that a small business could start on SQLite, and then switch over to this when it's time to scale, without re-writing it in the Postgres dialect?<p>Regardless, it's very very cool. Would love to see it get turned into a product.
If you've written your own multi-versioning, what does FDB bring to the table that you couldn't have gotten out of other distributed but non-transactional KV stores? E.g. Cassandra, etc. Isn't there an overhead to the MVCC aspect of FDB? And it sounds like you've had to jump hoops around things to get past its duration and size limits, as well...
> But a group of N sqlite databases is an N-writer database. And mvsqlite provides the necessary mechanisms to do serializable cross-database transactions without additional overhead.<p>I'm confused, are these databases planned to be replicated? Or is it expected for the databases to have separate schemas?
This sounds really cool, do you have any source code in a workable state yet or is this project still in the formulative ideation phase?<p>If there's anything concrete so far, I'd love to take a look and/or try it out!
Good work! I don't understand the innards of this at all, but I love the design.<p>Iirc when I first read through the book designing data centric applications, the author talked about a lot of trade-offs for data storage and replication and network connectivity issues.<p>the impression I left with was for my particular application that foundation DB was the best option I had for my wild dreams of web scale popularity.<p>the current data persistence layer I use is sqlite, which means if I use mvsqlite, that only makes it easier for me to try to use foundation DB for my someday no doubt irresistible web application.