Show HN: Distributed SQLite on FoundationDB

150 pointsby losfairalmost 3 years ago

Hello HN! I'm building mvsqlite, a distributed variant of SQLite with MVCC transactions, that runs on FoundationDB. It is a drop-in replacement that just needs an `LD_PRELOAD` for existing applications using SQLite.I made this because Blueboat (<a href="https://github.com/losfair/blueboat" rel="nofollow">https://github.com/losfair/blueboat</a>) needs a native SQL interface to persistent data. Apparently, just providing a transactional key-value store isn’t enough - it is more easy and efficient to build complex business logic on an SQL database, and it seems necessary to bring a self-hostable distributed SQL DB onto the platform. Since FoundationDB is Blueboat’s only stateful external dependency, I decided to build the SQL capabilities on top of it.At its core, mvsqlite’s storage engine, mvstore, is a multi-version page store built on FoundationDB. It addresses the duration and size limits (5 secs, 10 MB) of FDB transactions, by handling multi-versioning itself. Pages are fully versioned, so they are always snapshot-readable in the future. An SQLite transaction fetches the read version during `BEGIN TRANSACTION`, and this version is used as the per-page range scan upper bound in future page read requests.For writes, pages are first written to a content-addressed store keyed by the page's hash. At commit, hashes of each written page in the SQLite transaction is written to the page index in a single FDB transaction to preserve atomicity. With 8K pages and ~60B per key-value entry in the page index, each SQLite transaction can be as large as 1.3 GB (compared to FDB's native txn size limit of 10 MB).mvsqlite is not yet "production-ready", since it hasn’t received enough testing, and I may still have a few changes to make to the on-disk format. But please ask here if you have any questions!

15 comments

faizshahalmost 3 years ago

I love the idea of distributed SQLite but I’m having a hard time understanding which parts of FoundationDB and which parts of SQLite are available in this implementation.I’m guessing virtual table extensions work with this since you’re just replacing the storage engine? So we could in theory use FTS5 and even OSQuery and other extensions right?However since this is using FoundationDB I’m also guessing we can’t use this as a serverless embedded DB since since you’ll probably need a foundation db cluster to use this. Is that right?So if I understand correctly this is a SQLite query engine on top of FoundationDB with distributed transactions and we can theoretically use SQLite ecosystem stuff like FTS5 and datasette on top of it.

评论 #32289697 未加载

评论 #32292515 未加载

endisneighalmost 3 years ago

One thing I've been curious about with FDB (need to find time to try this myself) is using FDB as a way to easily implement replication with consistency.For example: You have 5 Postgres instances. You send the query "SELECT * FROM TABLE" to FDB, you want the result of this from any of the 5 Postgres's (first to return wins). When you insert, you want to insert into all 5 and make sure that all 5 have actually finished the transaction before telling the client.Seems simple enough to implement via FDB?

评论 #32293365 未加载

dilyevskyalmost 3 years ago

I think foundationdb uses sqlite as its tablet kv engine? It’s sqlite all the way down

评论 #32292698 未加载

mpingalmost 3 years ago

FDB is a nice piece of technology if you know how to go around its constraints. Congrats on the project.

liuliualmost 3 years ago

This really, IMHO (as someone implements things on top of SQLite too <a href="https://dflat.io" rel="nofollow">https://dflat.io</a>) pushes SQLite too far as the implementation of cross-db transactions have some big issues: <a href="https://www.sqlite.org/limits.html" rel="nofollow">https://www.sqlite.org/limits.html</a> (the number of attached databases cannot exceed 10 or 125 (if you compile your own)) <a href="https://www.sqlite.org/wal.html" rel="nofollow">https://www.sqlite.org/wal.html</a> (in WAL mode, there is no transactional guarantee for cross database transactions (atomic per database, but not cross database))

评论 #32291990 未加载

评论 #32291859 未加载

LAC-Techalmost 3 years ago

So the idea is that a small business could start on SQLite, and then switch over to this when it's time to scale, without re-writing it in the Postgres dialect?Regardless, it's very very cool. Would love to see it get turned into a product.

cmrdporcupinealmost 3 years ago

If you've written your own multi-versioning, what does FDB bring to the table that you couldn't have gotten out of other distributed but non-transactional KV stores? E.g. Cassandra, etc. Isn't there an overhead to the MVCC aspect of FDB? And it sounds like you've had to jump hoops around things to get past its duration and size limits, as well...

评论 #32290858 未加载

learndeeplyalmost 3 years ago

> But a group of N sqlite databases is an N-writer database. And mvsqlite provides the necessary mechanisms to do serializable cross-database transactions without additional overhead.I'm confused, are these databases planned to be replicated? Or is it expected for the databases to have separate schemas?

评论 #32289847 未加载

metadatalmost 3 years ago

This sounds really cool, do you have any source code in a workable state yet or is this project still in the formulative ideation phase?If there's anything concrete so far, I'd love to take a look and/or try it out!

评论 #32289891 未加载

gigatexalalmost 3 years ago

Any chance this could get Jepsen tested? I’d donate to make that happen.

Multicompalmost 3 years ago

Good work! I don't understand the innards of this at all, but I love the design.Iirc when I first read through the book designing data centric applications, the author talked about a lot of trade-offs for data storage and replication and network connectivity issues.the impression I left with was for my particular application that foundation DB was the best option I had for my wild dreams of web scale popularity.the current data persistence layer I use is sqlite, which means if I use mvsqlite, that only makes it easier for me to try to use foundation DB for my someday no doubt irresistible web application.

ranjanprjalmost 3 years ago

I think this is a great idea, and probably sqlite bigger sibling PostgreSQL adopts it one day.

fnord123almost 3 years ago

Comdb2: distributed sqlite: <a href="http://comdb2.org/" rel="nofollow">http://comdb2.org/</a>

评论 #32290625 未加载

reichardtalmost 3 years ago

Does this work similar to rqlite or dqlite from a usage standpoint, or does it solve a different use case?

Something1234almost 3 years ago

Haven't you heard of bedrock by expensify?

评论 #32292651 未加载