Turning SQLite into a Distributed Database

347 pointsby losfairalmost 3 years ago

17 comments

monstradoalmost 3 years ago

This is exactly what the engineers behind FoundationDB (FDB) wanted when they open sourced. For those who don't know, FDB provides a transactional (and distributed) ordered key-value store with a somewhat simple but very powerful API.Their vision was to build the hardest parts of building a database, such as transactions, fault-tolerance, high-availability, elastic scaling, etc. This would free users to build higher-level (Layers) APIs [1] / libraries [2] on top.The beauty of these layers is that you can basically remove doubt about the correctness of data once it leaves the layer. FoundationDB is one of the most (if not the) most tested [3] databases out there. I used it for over 4 years in high write / read production environments and never once did we second guess our decision.I could see this project renamed to simply "fdb-sqlite-layer"[1] <a href="https://github.com/FoundationDB/fdb-document-layer" rel="nofollow">https://github.com/FoundationDB/fdb-document-layer</a>[2] <a href="https://github.com/FoundationDB/fdb-record-layer" rel="nofollow">https://github.com/FoundationDB/fdb-record-layer</a>[3] <a href="https://www.youtube.com/watch?v=OJb8A6h9jQQ" rel="nofollow">https://www.youtube.com/watch?v=OJb8A6h9jQQ</a>

评论 #32541653 未加载

评论 #32542152 未加载

alberthalmost 3 years ago

Don’t forget BedrockDB (built on SQLite) that’s used in production at Expensify.How it scales as well.<a href="https://bedrockdb.com/" rel="nofollow">https://bedrockdb.com/</a><a href="https://blog.expensify.com/2018/01/08/scaling-sqlite-to-4m-qps-on-a-single-server/" rel="nofollow">https://blog.expensify.com/2018/01/08/scaling-sqlite-to-4m-q...</a>

评论 #32545760 未加载

评论 #32541603 未加载

yencabulatoralmost 3 years ago

Relevant reading about FoundationDB building a SQL database on top of a distributed key-value store: <a href="https://www.voltactivedata.com/blog/2015/04/foundationdbs-lesson-fast-key-value-store-not-enough/" rel="nofollow">https://www.voltactivedata.com/blog/2015/04/foundationdbs-le...</a> (That one replaced SQLite's btree, this one puts pages of the btree as values in the key-value store.)Another approach using FUSE, making arbitrary SQLite-using applications leader-replica style distributed for HA: <a href="https://github.com/superfly/litefs" rel="nofollow">https://github.com/superfly/litefs</a> (see also <a href="https://litestream.io/" rel="nofollow">https://litestream.io/</a> for WAL-streaming backups, that's the foundation of this)

phamiltonalmost 3 years ago

If I understand this correctly, it's similar in design to AWS Aurora or GCP AlloyDB.The underlying storage provides the distributed primitives and the DB itself just reads and writes to it.Like Aurora, some tweaks to the engine were required, but the core query engine is largely intact.Has anyone seen postgresql on FoundationDB? Is there anything unique about SQLite that makes it better suited for this approach? One thing that comes to mind is how they were able to do block level locking instead of full db locking. That took some tweaking but probably significantly less than postgresql might require.

samsquirealmost 3 years ago

This is interesting..I would like to learn more how they implemented MVCC.I wrote single machine MVCC in Java and I'm curious if there are other ways of implementing it. One way is event sourcing.I use an integer to store the latest commit version and I only allow transactions to see versions less than the transaction's timestamp. This is the multiversion part.The concurrency control part is enforced by checking if the read timestamp of the key is less than the reading transaction timestamp, if so someone got there before us and we abort and restart.I am thinking how to build the distributed part.I need a timestamp server the same way Google's Spanner needs TrueTime for monotonic timestamps but also some way of broadcasting read timestamps to detect conflicts between nodes. So I'm thinking of broadcasting timestamp events and using that to detect transactions that have dangerous dependencies.

评论 #32540170 未加载

评论 #32541684 未加载

hardwaresoftonalmost 3 years ago

So I just fell down the rabbit hole of figuring out how to use SQLite with Ceph (turns out a thing called libcephsqlite[0][1] exists) -- awesome to see this new take on distributed SQLite.The caveats for dqlite and rqlite always felt kind of awkward/risky to me -- in stark contrast to SQLite which is so stable/"built in" that you don't think about it's failure modes. Having to worry about what exactly I ran (ex. RANDOM()) was just a non-starter (IIRC rqlite has this problem but not dqlite? or the other way around -- one replicates at statement level the other at WAL level).That said though, the biggest sticking point with all this SQLite goodness is how to make sure that certain libraries (any popular extension -- vsv, spatialite, libcephsqlite) were loaded for any application using SQLite -- there seem to be only a few options:- calling load_extension[2] from code (this is somewhat frowned upon, but maybe it's fine)- LD_PRELOAD (mvsqlite does this)- Building your own SQLite and swapping out shared libs (mvqslite also does this, because statically compiled sqlite is a nuisance)- Trapping/catching calls to dlopen (also basically requires LD_PRELOAD, but I guess you could go custom kernel or whatever)This is probably the one big wart of SQLite -- it's a bit difficult to pull in new interesting extensions.I also found this hack[3] which looks quite interesting for building something more general/reusable...[EDIT] - Also while I'm here, I think FDB is probably one of the most under-rated massive-scale NoSQL databases right now. It gets nearly no press (to be fair because it went closed then open again), but it's casually a massive force behind Apple's services at scale.[0]: <a href="https://docs.ceph.com/en/latest/rados/api/libcephsqlite/" rel="nofollow">https://docs.ceph.com/en/latest/rados/api/libcephsqlite/</a>[1]: <a href="https://github.com/rook/rook/issues/10689" rel="nofollow">https://github.com/rook/rook/issues/10689</a>[2]: <a href="https://www.sqlite.org/lang_corefunc.html#load_extension" rel="nofollow">https://www.sqlite.org/lang_corefunc.html#load_extension</a>[3]: <a href="https://github.com/cventers/sqlite3-preload" rel="nofollow">https://github.com/cventers/sqlite3-preload</a>

评论 #32544179 未加载

medvalmost 3 years ago

Also <a href="https://github.com/rqlite/rqlite" rel="nofollow">https://github.com/rqlite/rqlite</a>It’s supper cool as it does change sqlite.

conradevalmost 3 years ago

I really like the idea of using FUSE as SQLite backend, rather than injecting into the processI love the use case of querying SQLite from a CDN with range requests, because it allows for real “serverless” querying. For example, this: <a href="https://github.com/psanford/sqlite3vfshttp" rel="nofollow">https://github.com/psanford/sqlite3vfshttp</a>

评论 #32540122 未加载

nik736almost 3 years ago

What makes this special compared to rqlite or dqlite? Edit: <a href="https://github.com/losfair/mvsqlite/wiki/Comparison-with-dqlite-and-rqlite" rel="nofollow">https://github.com/losfair/mvsqlite/wiki/Comparison-with-dql...</a>

评论 #32540234 未加载

评论 #32540192 未加载

formerly_provenalmost 3 years ago

Speaking of "turning SQLite into things it wasn't really meant for", does anyone know of a compressing time-series layer for SQLite?

评论 #32543953 未加载

baqalmost 3 years ago

Sounds good in theory and if it really works as a drop in replacement it’s amazing tech, but numbers, please! I can make a beefy Postgres server handle 5-10-20k tps relatively easily, what can I expect from this?

评论 #32541634 未加载

评论 #32540153 未加载

endisneighalmost 3 years ago

I’m interested in how FoundationDB can make anything consistent.For example: you’re using Postgres. You send the request to FDB, it will ensure all Postgres transactions are consistent or tell Postgres to abort transactions.

lifeisstillgoodalmost 3 years ago

I am fascinated by how the initial authors of all these packages went from "that would neat if I had 3 months free" to "yeah, doing it now"

EGregalmost 3 years ago

How does this compare to, say, CockroachDB?

评论 #32541748 未加载

didipalmost 3 years ago

I am surprised that SQLite can be plugged into FDB just like that.

abujazaralmost 3 years ago

But why?

solarkraftalmost 3 years ago

I've seen a lot of Sqlite hype here in the past weeks and months.Please excuse my ignorance, but what do all these Sqlite-but-make-it-X solutions offer over a simple, more established solution like (nobody ever got fired for choosing) Postgres?

评论 #32540791 未加载

评论 #32540561 未加载

评论 #32540445 未加载

评论 #32540439 未加载