Shardines: SQLite3 Database-per-Tenant with ActiveRecord

255 pointsby julik24 days ago

19 comments

kgeist24 days ago

We use "database-per-tenant" with ~1 mln users.Database-per-tenant is great for read-heavy apps - most tenants are small and their tables don't have a lot of records, so even very complex joins are very fast.The main problem is that release times can increase substantially, because you have to migrate lots of individual databases one by one, and sometimes schema/data drift can occur between DBs, and then your release stops midway and you have to figure out why in some tenants the feature works, and in some it breaks... So it's not free launch, the complexity just shifts to a different place.

评论 #43812182 未加载

评论 #43812230 未加载

评论 #43813068 未加载

评论 #43814038 未加载

评论 #43812364 未加载

评论 #43812578 未加载

评论 #43812687 未加载

hinkley24 days ago

I really like SQLite but I wonder if what we are seeing here is a need for conventional OLTP databases to be able to unload parts of indexes from memory.Because essentially what’s happening with database per user is that we don’t keep anything in memory for users who are inactive, or only active on another instance, for essentially the same schema (or maybe a couple versions of a schema).This is starting to smell like the JSON situation with Mongo. Where Postgres is now twice as fast as Mongo at the thing Mongo is supposed to be an expert in.

评论 #43815872 未加载

评论 #43815417 未加载

mrits24 days ago

When data can be this isolated from each other and you don't have any scaling issues within a single tenant it's pretty hard to make a wrong design choice. Almost anything will work.

Lord_Zero24 days ago

I would say most people don't need a database per tenant and that is definitely not the norm. There are specific cases that you would need to negate the drawbacks such as migrations and schema drift.I think the lack of gems/libraries/patterns is proof of this. Just because you can doesn't mean you should.Not saying there's no reason to ever do it, proceed with caution and know for a fact you need db per tenant.

评论 #43815228 未加载

napsterbr24 days ago

Coincidentally I'm working on FeebDB[0], which is similar but for Elixir instead. It can be seen as a replacement to Ecto (which won't work well when you have thousands of databases).Mostly as a fun experiment, but also from the realization that every place I worked at in the past (small/medium-sized B2B startups) would greatly benefit from such architecture.Yes, there are massive trade-offs to this approach, and the concerns raised in the comment section are valid. This doesn't mean the database-per-tenant is never worth it. There's a sweet spot for it, and if it fits your business/application, I personally would consider it a technical advantage over competitors.My goal with FeebDB is to eliminate or reduce the common pain points of database-per-tenant, including:- ensure there is a single writer per database.- improved connection management across all tenants (e.g. only keep open at most 1000 DB connections, similar to an LRU cache).- on-demand migration (all shards are migrated on application startup, but if a shard that hasn't migrated yet receives a request, it will first perform the migration and then serve the request),- on-demand backups and replication (e.g. the library knows which shards were updated in the last X minutes, so it can trigger Litestream or similar on demand).- support for enumeration of databases (performing map/reduce/filter operations across multiple DBs)- support for clustered deployment with "pinned" tenants (for now I'm assuming the IOPS of a single beefy server should be enough for all use cases, but once that's no longer sufficient you can have "shards of shards")[0] - <a href="https://github.com/renatomassaro/FeebDB/">https://github.com/renatomassaro/FeebDB/</a>

shauntest1232124 days ago

Awesome stuff, Forward Email does similar with an encrypted sqlite db per mailbox / per user. Great way to differentiate protection per user.

devnull324 days ago

I think an in-between approach could be:1. Identify top-N tenants2. Separate the DB for these tenantsThe top-N could be based on mix of IOPS, importance (revenue wise), etc.The data model should be designed in such a way that from rows pertaining to each tenant can be extracted.

评论 #43814059 未加载

评论 #43814165 未加载

huhtenberg24 days ago

If noshing else, the name is exshellent. Very Sean Connery.

评论 #43815883 未加载

Mystery-Machine24 days ago

I use `site_id` in every model and keep everything in a single database. I think this is how Shopify started as well. You can see that because their product, variant, collection IDs are incremental across all shops. They might be using different solution today, but it feels like this is how they started.What I like about having everything in one db, until it grows too big, is that I can do cross-users analytics/reporting. I also had it happen, like it was mentioned in the article, that I needed to share some data between users. Having everything in a single database made this much simpler problem than needing to move data between databases.

评论 #43813466 未加载

andrewstuart24 days ago

Database per tenant can get real nasty when schemas get out of sync, then you’re in real pain.Postgres with row based access control is a much better solution to database per tenant/strong isolation.

评论 #43814177 未加载

评论 #43814079 未加载

评论 #43820056 未加载

evolve2k22 days ago

Fascinating article documenting a clear challenge over multiple versions of Rails. My read on the subtext of the article is that these issues of proper SQLite support within Active Record won’t fix with a clever gem or quick script but rather need dedicated effort in the rails core.What surprises me a little is that the article is written just now in April 2025 but makes no mention of recent developments in rails.Specially I’m referring to the Rails 8 release. My take away from Rails 8 is that one of the biggest features is significant work towards making SQLite very much a first class citizen of rails, with significant focus on enhancin support for it across the stack.The text below is from the announcement: “Rails 8 No PASS required”“Getting SQLite ready for productionOn top of the trifecta of Solid adapters that makes it possible for SQLite to power Action Cable, Rails.cache, and Active Job, a bunch of work has gone into making the SQLite adapter and Ruby driver suitable for real production use in Rails 8. At 37signals, we’re building a growing suite of apps that use SQLite in production with ONCE. There are now thousands of installations of both Campfire and Writebook running in the wild that all run SQLite. This has meant a lot of real-world pressure on ensuring that Rails (and Ruby) is working that wonderful file-based database as well as it can be.”<a href="https://rubyonrails.org/2024/11/7/rails-8-no-paas-required" rel="nofollow">https://rubyonrails.org/2024/11/7/rails-8-no-paas-required</a>Surely this work bears fruit for the challenges the developer speaks of here. Curious as to if anyone has further insight into what extent Rails 8 addresses the highlighted shortcomings and further if there is a significant gap what would be needed to have this resolved for Rails 9 which I expect is underway.

breckenedge24 days ago

> And, of course, the “database per tenant” workflow is just startingJames Edward Gray spoke about doing this at RailsConf in 2012. <a href="https://youtu.be/GRfJ9lni4QA?si=BoDI5gQ8GvM1PDhw" rel="nofollow">https://youtu.be/GRfJ9lni4QA?si=BoDI5gQ8GvM1PDhw</a>

haarts24 days ago

I used something similar in the past. Loved it. User wants their data? Bang! Here's the entire database. User deletes their account? `rm username.sql` and you're done. Compliance was never easier.

bsaul24 days ago

wonder if active record couldn’t borrow some pattern from coredata with its managedobjectcontext property associated to every object. This lets you work with objects transparently, not having to carry a « db » property everywhere, and the framework always knowing where the object come from, it can trigger an error if you’re trying to do something unholy by mixing them.

评论 #43811922 未加载

scosman24 days ago

What’s the disk behind this strategy? Lots of sqllite DBs is easy until you have lots of servers, no?

评论 #43813453 未加载

pushcx24 days ago

Could anyone who runs Rails with sqlite in production share some scale numbers like r/w per second, # rows in the db, and vps size? I have used it at trivial scale for SolidQueue and SolidCache, but I can't find experience reports for the primary OLTP db.

codaphiliac24 days ago

Wonder how many sqlite databases would be too many. At one point I assume not all databases can be kept opened at all time. what sort of overhead would there be serving a tenant not opened up yet? there has to be caches etc. not warmed up causing lots of disk IO

评论 #43814123 未加载

评论 #43812925 未加载

评论 #43812939 未加载

DidYaWipe24 days ago

Love the name.That is all.

评论 #43815889 未加载

rcarmo24 days ago

Came here for the amazing title pun. Like the approach, was a bit sad it's about Ruby but won't judge.

评论 #43812957 未加载

评论 #43812825 未加载