If you are using MySQL I also recommend Percona's Online Schema Change [1], which is basically a robust implementation of this post's algorithm: it handles more edge cases, has existed for at least 10 years now, and works better out of the box with AWS RDS than gh-ost. (If you have production experience running gh-ost on RDS against tables with hundreds of millions of rows, please share your experience/tricks, I'd love to hear them!)<p>[1] <a href="https://www.percona.com/doc/percona-toolkit/3.0/pt-online-schema-change.html" rel="nofollow">https://www.percona.com/doc/percona-toolkit/3.0/pt-online-sc...</a>
Something I'm doing in my CV application is using 1 SQLite DB file per user.<p>When a user logs in, a check is done: "Does this user's DB file have the latest migration?" If not, the migrations are applied. That way, you only get a slight delay as a user when a migration is needed. No other users' DB files are affected. More technical details are in the FAQ: <a href="https://withoutdistractions.com/cv/faq" rel="nofollow">https://withoutdistractions.com/cv/faq</a><p>In terms of the article: I'm only changing 1 tire at a time at 100mph, not all of them.<p>PS: I recently did a "Show HN" about the app, it got some interesting feedback: <a href="https://news.ycombinator.com/item?id=31246696" rel="nofollow">https://news.ycombinator.com/item?id=31246696</a>
I dunno... people seem to like zero-downtime migrations, but really, for most companies a little bit of downtime or degraded functionality doesn't matter. Most applications aren't even large enough for the downtime to be more than a couple of minutes.<p>Having some scheduled downtime saves you a lot of the complexity of writing, monitoring, and finalizing these migrations. It also makes it a lot easier to keep a consistent state across code and data.<p>The article doesn't mention how to deal with different data models, constraints, etc.
MySQL has some robust tooling in this space. Some of the tools use triggers to copy to a new table. GitHub's gh-ost[1] is probably the state of the art, and uses the binary log stream to replicate the data.<p>[1] <a href="https://github.com/github/gh-ost" rel="nofollow">https://github.com/github/gh-ost</a>
Anybody interested in this subject might also be interested in a tool for Postgres I’ve been working on, Reshape: <a href="https://github.com/fabianlindfors/reshape" rel="nofollow">https://github.com/fabianlindfors/reshape</a>. It aims to fully automate away all the pain and manual steps that zero-downtime migrations normally require :)
> Now the most challenging part: explaining to Product why their seemingly small request took 3x longer than expected. If it helps you can send them this article. Good luck!<p>We had to solve this problem in a way that also took our self-hosted users into account. Essentially "changing Tires at 100mph" in environments we don't control. Still polishing it but will plug a post about it here if anyone finds it relevant:<p><a href="https://posthog.com/blog/async-migrations" rel="nofollow">https://posthog.com/blog/async-migrations</a>
I think PlanetScale folks have blogged well about how schema migration tools work and their traffic splitting, rewinding abilities are very nice.<p><a href="https://docs.planetscale.com/learn/how-online-schema-change-tools-work" rel="nofollow">https://docs.planetscale.com/learn/how-online-schema-change-...</a><p><a href="https://planetscale.com/blog/its-fine-rewind-revert-a-migration-without-losing-data" rel="nofollow">https://planetscale.com/blog/its-fine-rewind-revert-a-migrat...</a>
When I was first starting out writing applications, it was very common for the DBA in charge of a database to require all writes to go through stored procedures and all reads to go through views. It was at times onerous, but it made migrations like these a snap. With most migrations/RDBMSes it could even be atomic.
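A rough illustration of why the view indirection helps, sketched here in Python with SQLite (table and column names are made up; a real setup would also route writes through stored procedures, and in e.g. Postgres the view swap can run inside a single transaction, making it atomic):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Clients read through the view "users", never the base table directly.
conn.executescript("""
CREATE TABLE users_v1 (id INTEGER PRIMARY KEY, name TEXT);
CREATE VIEW users AS SELECT id, name FROM users_v1;
""")
conn.execute("INSERT INTO users_v1 (name) VALUES ('alice')")

# Migration: create the new table, copy the data, and repoint the view.
# Readers only know the view, so the old column rename is invisible to them.
conn.executescript("""
CREATE TABLE users_v2 (id INTEGER PRIMARY KEY, full_name TEXT);
INSERT INTO users_v2 (id, full_name) SELECT id, name FROM users_v1;
DROP VIEW users;
CREATE VIEW users AS SELECT id, full_name AS name FROM users_v2;
""")

# Existing read queries keep working unchanged.
rows = conn.execute("SELECT name FROM users").fetchall()
```

The key design point is that the view is a stable contract: the underlying table can be rebuilt, renamed, or restructured, and only the view definition has to change.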
I thought of another trick. Haven't tried it, but it should work. Where applicable, put tenants in separate schemas. If each tenant's tables are now very small in comparison, then a write lock is not that important: simply migrate each schema, then do a blue-green deployment between code versions. Done.<p>One downside is the downgrade path. If you still want that, you have to do it as in the article.
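A minimal sketch of what "simply migrate" could look like under that assumption of one schema per tenant (schema and table names are hypothetical; each rendered statement would be executed in its own short transaction, so any write lock is held only for one small tenant at a time):

```python
def per_tenant_statements(tenant_schemas, ddl_template):
    """Render the same DDL once per tenant schema.

    Each statement is meant to run in its own transaction, so a lock
    only ever covers one tenant's small table.
    """
    return [ddl_template.format(schema=schema) for schema in tenant_schemas]

stmts = per_tenant_statements(
    ["tenant_a", "tenant_b"],
    'ALTER TABLE "{schema}".orders ADD COLUMN note TEXT',
)
```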
Side Question - could anyone suggest a tool for db cloning in context of on-commit preview environments?<p>I am only aware of this tool:
<a href="https://github.com/postgres-ai/database-lab-engine" rel="nofollow">https://github.com/postgres-ai/database-lab-engine</a><p>But it looks like it requires too much manual work.
Ah, it's about databases... the title here immediately made me think of a much more hardcore "migration" effort: <a href="https://www.amusingplanet.com/2019/10/an-incredible-move-indiana-bell.html" rel="nofollow">https://www.amusingplanet.com/2019/10/an-incredible-move-ind...</a>