I don't think the upsides are worth all the work.<p>You can spend a lot of time getting databases and other stateful workloads to work -- mess around with StatefulSet and PVC on top of all the normal Kubernetes concepts, and what do you get in the end? Are you really better off than you would have been if you ran the database in EC2?<p>Plus, "cattle, not pets" kind of breaks down once you start using StatefulSets and PVCs. Those things exist to make Kubernetes more like a static environment for workloads that can't handle being run like ephemeral cattle. So why not just keep using your static environment?<p>If Kubernetes is the only workload management control plane you have, then I guess this makes sense. But if you are already able to deploy your databases with existing tools, and those existing tools don't <i>really</i> suck, it's probably not worth migrating. It would take a lot of time and introduce significant new risks and operational complexity without a compensating payoff.
I've been quite happy with CloudNativePG on k8s. It was simple for me to set up on a k8s cluster with one primary and two replicas: if the primary box goes down, another instance becomes primary; it deals with connection pooling; and it's simple to have backups go to a cloud object store. The alternative is dealing with all the replication manually, making sure that your leader election and failover work, making sure you can stand up new PG instances and get things replicated to the new instance, having a service that checks the health of the database to trigger a failover, etc. It's certainly not impossible or anything like that, but CloudNativePG has been pretty easy. K8s isn't perfect or anything, but it's been a pretty nice experience for me.<p>I've tried other Postgres operators and been disappointed, and CloudNativePG did require a little learning, but it's not like getting replication, Patroni, etcd, PgBouncer, HAProxy, and pgBackRest all running for a high-availability Postgres deployment is easy or wouldn't require learning.<p>As the author says, "[k8s's] operator model allows end users to programmatically manage their workloads by writing code against the core k8s APIs to automatically perform tasks that would previously have to be done manually." To me, that's the benefit. The operator can handle tasks like adding a replica or failing over the primary to one of the replicas. I could presumably do some of that with other tools on bare metal/VMs (I can always shell-script things), but I've had a good experience with CloudNativePG's operator. Likewise, as the author says, making day-2 operations easier is a big thing.<p>K8s does have some annoying amount of complexity, but it's been nice overall.
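For anyone curious what that setup looks like, a minimal CloudNativePG Cluster manifest is roughly this (the name and bucket path are placeholders, and I've left out the object-store credentials block; check the CNPG docs for the exact backup fields):

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-main              # placeholder name
spec:
  instances: 3               # one primary, two replicas; failover is automatic
  storage:
    size: 20Gi
  backup:
    barmanObjectStore:       # ships base backups and WAL to object storage
      destinationPath: s3://my-backup-bucket/pg-main   # placeholder bucket
```

The operator watches this one resource and creates the pods, PVCs, and services, and wires up streaming replication for you.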
That's just a really, really bad write-up of the real problems of running a database on k8s.<p>You need HA because k8s should already be running with automatic node upgrades.<p>You need a pod disruption budget to make sure the database keeps running and fails over when a node fails or gets upgraded.<p>You want to either heavily overprovision memory or carefully fine-tune memory requests and limits before k8s starts constantly throwing your database out.<p>K8s is not a VM.<p>If you use k8s and still don't take care of application migration strategies, you still don't understand what cloud native means.<p>There are still other things missing here, but still...<p>Of course excluding hobby people playing with k8s.<p>Memory and node upgrades are the two single biggest issues you'll see that disrupt service.<p>Otherwise k8s is a dream come true.<p>I would still try to use a managed DB if it's critical.<p>Additional points:
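A pod disruption budget like the one mentioned above is only a few lines (the label is a placeholder); it keeps node drains and upgrades from taking out more than one database pod at a time:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: postgres-pdb
spec:
  maxUnavailable: 1          # drain/upgrade at most one Postgres pod at a time
  selector:
    matchLabels:
      app: postgres          # placeholder: must match your database pods' labels
```

On the memory point: setting the container's memory request equal to its limit gives the pod Guaranteed QoS, which makes it among the last things the kubelet evicts under node memory pressure.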
The Zalando postgres operator is great and shows the real magic of k8s and operators.<p>Use a Helm chart and just bring your own little database for dev, test, and e2e tests.<p>You can easily use autoscaling for node profiles. No noisy neighbors. If your DB is too small for normal nodes, you don't have a problem anyway.
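To the point about bringing your own little database for dev/test, a minimal Zalando `postgresql` resource is roughly this (team and cluster names are placeholders; the operator expects the name to be prefixed with the teamId):

```yaml
apiVersion: acid.zalan.do/v1
kind: postgresql
metadata:
  name: acid-dev-db        # Zalando convention: name is prefixed with the teamId
spec:
  teamId: acid
  numberOfInstances: 1     # a single throwaway instance is fine for dev/e2e
  volume:
    size: 1Gi
  postgresql:
    version: "15"
```

Drop that in a test namespace, point your tests at the generated service, and delete the resource when the run is done.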
The key for me is the level of automation that you can reach at a reasonable "development cost". Let me elaborate.<p>K8s, if anything, is an API. An API that allows you to interact with compute, storage and networks in a way that is abstracted from the actual underlying infrastructure. This is incredibly powerful. You can, essentially, code and automate all your infrastructure.<p>But this goes beyond deployment, something you could achieve (more or less) with tools like Terraform or Pulumi. Enter "Day 2 operations".<p>Day 2 operations are essential for any database. And cloud services have done a good job at automating them. Speaking of Postgres, my daily job, things like HA and backups but also minor and major version upgrades are table-stakes day 2 operations.<p>If you want to build these day 2 operations in the cloud (say on VMs), even though you have APIs to do so, a) they don't implement a pattern like Kubernetes' reconciliation cycle; and b) you have a distinct API per cloud. K8s solves both problems, making it way "cheaper" to build such automation. On K8s, a given operator can code these day 2 operations against K8s APIs. Therefore, if you want to build such automation, either you are a cloud provider (and potentially do this only for your own cloud) or you do it on Kubernetes.<p>So much so that existing operators have already gone beyond what DBaaS offerings do. Speaking of StackGres [0] (disclaimer: founder), we have implemented day 2 operations (other than the "table stakes" ones that I mentioned before) that no other DBaaS offers as of today, such as vacuums, repacks and even benchmarks (and more day 2 operations will be developed). See [1] for the CRD specs of SGDbOps, our "Day 2 operations", if you are interested.<p>[0] <a href="https://stackgres.io" rel="nofollow">https://stackgres.io</a>
[1] <a href="https://stackgres.io/doc/latest/reference/crd/sgdbops/" rel="nofollow">https://stackgres.io/doc/latest/reference/crd/sgdbops/</a>
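Stripped of all the controller machinery, the reconciliation cycle the parent describes is just a loop that diffs desired state against observed state and emits converging actions. A toy sketch in Go (not real operator code, and not the controller-runtime API; types and action names are made up for illustration):

```go
package main

import "fmt"

// ClusterSpec is the desired state a user declares (e.g. in a CR).
type ClusterSpec struct {
	Replicas int // desired number of database instances
}

// ClusterStatus is the state actually observed in the cluster.
type ClusterStatus struct {
	Replicas int // instances currently running
}

// reconcile returns the actions needed to move status toward spec.
// A real operator would issue these as k8s API calls (create/delete Pods),
// then be re-invoked until observed state matches desired state.
func reconcile(spec ClusterSpec, status ClusterStatus) []string {
	var actions []string
	for status.Replicas < spec.Replicas {
		actions = append(actions, "create replica")
		status.Replicas++
	}
	for status.Replicas > spec.Replicas {
		actions = append(actions, "delete replica")
		status.Replicas--
	}
	return actions
}

func main() {
	// A node just failed: we want 3 replicas but observe only 2.
	fmt.Println(reconcile(ClusterSpec{Replicas: 3}, ClusterStatus{Replicas: 2}))
}
```

The point is that "add a replica" and "fail over the primary" stop being runbook steps and become convergence logic that runs every time reality drifts from the spec.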
This article does a great job describing the investment required to pull this off. At HubSpot, my team is running a large Vitess/MySQL deployment (500+ distinct databases, some sharded, multi-region) atop k8s today and had to learn a lot of those same lessons and primitives. We opted to write our own operator(s) to do it. In the end, the investment has paid off in terms of being able to build self-service functionality for the rest of the business and write the kinds of tools and workflows that allow us to support it with a relatively small team. The value is in the operator pattern itself and being able to manipulate things on a common control plane. Compared to the alternative of managing this with Terraform and Puppet/Ansible/Chef directly on EC2, which I've also done before, it's a better experience and much more maintainable even at the fixed expense of additional training and tooling.<p>I won't disagree with others that RDS is probably worth it until you need something very specific or have reached a certain scale.<p>Happy to share tips or pointers for anyone going down this path, specifically with MySQL or database workloads in general.
I guess there must be a use case I'm missing, but RDS is working so well for me that it's hard to imagine why I would not shift most of the operational concerns to this competent vendor.<p>The only thing I can think of is cost. My usage probably isn't high enough for there to be any financial benefit to an alternative... but if it was, maybe I'd be considering this.
Hi, author here! Over the past 6 months, I've been building a hosted service for a database on top of k8s at QuestDB, and wanted to share some of my thoughts on the topic. I was inspired by the Twitter discussion led by Kelsey Hightower a few weeks ago. Hope you find it interesting!
I used to work for an org that deployed 3rd-party legaltech "apps" on Kubernetes with all batteries included -- Postgres, RabbitMQ, Redis, you name it, I have seen it. Running StatefulSets, even with the best operators out there, with a team of 4 is nothing short of a nightmare. Couple this with the stability of Rook Ceph.<p>In 2019, every operator had crazy bugs, and we inherited all of them. You have to solve not just database-level errors but also errors popping up from the operators. If you can avoid databases on Kubernetes, you should just do it.
I've recently worked on putting Postgres into Kubernetes using the Zalando operator. The impression has been such a mixed bag that it looks like we need to start over with some other operator. When we run into problems, the documentation, error messages and configuration structure have been quite cryptic.<p>Does anyone have any specific recommendations on what to use (like which operator) when setting up a Postgres cluster on k8s, specifically for standby replication?
Not to diminish the product that QuestDB is working on, but another solution that works very well with Kubernetes is Vitess. Vitess is basically sharded MySQL, but it manages this automatically and very well, and has built-in Kubernetes support, so it really handles the "pets to cattle" thing well.
> K8s has an extensible Operator pattern that you can use to manage your own Custom Resources (CRs) by writing and deploying a controller<p>I have seen it fail way too many times. Inspecting a failing deployment that now has some magic Go code someone wrote running on the cluster. I can see using the basic kube building blocks: deployments, pods, config maps, etc.; there are enough guides and tools to help you out. As soon as you start writing code that runs in there, you're dealing with two problems: the actual thing you're deploying, and now the operator too.<p>Well, and then you need a mesh, and a way to manage certificates. And if it's a database, a way to manage all the volumes. Everything looks good at the architect level -- all the boxes and arrows line up -- but when it breaks in production it's a nightmare to debug.
StatefulSet and PVCs aren’t sufficient to fully handle all the likely resilience challenges of running a database cluster on K8S. There needs to be some rethinking on how StatefulSet works to make it more appropriate to this use case, such as allowing Pods to be started out of order when recovering from failures.<p>I worked in this problem space extensively until 2020, and I think that there are paths forward but they require changes in K8S that none of the folks involved seem motivated to make. Realistically to make databases in K8S work well today you need a database built for K8S rather than one adapted for K8S.<p>The building blocks present today are not fundamentally capable of building a positive UX for adapting existing databases to K8S, but this is something that is worth making possible and I hope the community gets there some day.
As a relative newcomer to k8s I was a bit surprised at the lack of backup tools available, coming from the world of on-prem Veeam which had more features than I knew what to do with. In my current role we had to find a way to back up our Postgres DBs running on k8s. We started using Kanister to actually take the backups but found there wasn't much around to actually manage the backups' lifecycle. I ended up writing Taweret (<a href="https://github.com/swissDataScienceCenter/taweret">https://github.com/swissDataScienceCenter/taweret</a>), a small tool which just ends up interacting with the Kanister CRDs to delete backups we no longer require based on a defined backups strategy.
We ran the Zalando operator for Postgres in k8s for a year, until finally succumbing to the technical debt that leaks out of every bit of its software.<p>After switching to the Crunchy Data pg operator v5 on k8s, we've had close to zero problems -- one or two times a year the log shipping / HA replication fails and we have to restart it -- but it's really neat! I can *warmly* recommend it; it really is CloudSQL in k8s.
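For reference, a minimal PGO v5 `PostgresCluster` looks roughly like this (the "hippo" name and sizes are placeholders; double-check field names against Crunchy's reference docs):

```yaml
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: hippo
spec:
  postgresVersion: 15
  instances:
    - name: instance1
      replicas: 3                      # one primary plus two HA replicas
      dataVolumeClaimSpec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
  backups:
    pgbackrest:                        # pgBackRest comes built in with the operator
      repos:
        - name: repo1
          volume:
            volumeClaimSpec:
              accessModes: ["ReadWriteOnce"]
              resources:
                requests:
                  storage: 10Gi
```

One resource gets you HA, failover, and scheduled backups, which is most of what you'd otherwise assemble from Patroni + pgBackRest by hand.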