This is an excellent, well-explained article. For most of the small companies I work for, database hosting is treated like a utility, so we end up using RDS etc. The benefit of arranging your own k8s database setup is outweighed by the cost of maintenance. On one hand it's nice that it's one less thing to worry about; on the other, I miss this kind of deep nerdery.
The truth is Kubernetes already has all the primitives to run databases and other stateful applications. What's challenging is making them reliable, performant and scalable. If you manage to do it properly, it will be far less costly and easier to manage than any DBaaS out there.
There are 2 dimensions to the problem:
1) How to easily deploy and operate a "production" DB in K8S. As the toolset has evolved, we've seen a lot of different operators and customization tools to facilitate this. Today it is rather trivial to write the manifests that will deploy a Redis cluster fit to your needs.
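As a rough sketch of what "rather trivial" looks like (all names, replica counts and sizes here are illustrative, not from the article), a minimal Redis StatefulSet with per-pod persistent volumes might be:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
spec:
  serviceName: redis        # headless service giving each pod a stable DNS name
  replicas: 3
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
      - name: redis
        image: redis:7
        ports:
        - containerPort: 6379
        volumeMounts:
        - name: data
          mountPath: /data
  volumeClaimTemplates:      # one PVC per pod, re-attached on reschedule
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
```

A real cluster setup would add replication/sentinel configuration on top (or use an operator that generates it), but the manifest shape is this simple.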
2) You need a REALLY good Kubernetes-native data service layer that guarantees fast failover, synchronous replication and performance for your persistent volumes. StatefulSets alone won't really help in case of node failure... and don't even think about using NFS. Ondat/StorageOS are leaders in that space; it's really simple to deploy, so you can give it a try in less than 5 minutes.
Once you've made sure that your stateful configuration is right and that the network/data layer is reliable, you can leverage all the observability tooling commonly used with K8S (ELK/EFK, Prometheus, etc.). Your stateful apps will be ready for CI and other DevOps integrations using the K8S tools (Tekton, ArgoCD etc). This will make your devs happy, as infra is no longer the blocker...
While I am happy about the progress people are making in bringing databases to container orchestrators, I don't see myself moving off a dedicated virtual machine for the database. I find it easier/safer to run it this way.<p>Shameless plug: if you want to start running your own PostgreSQL with SSL, SELinux, automatic system updates, attached storage, and more... I have a simple demo as part of <a href="https://deploymentfromscratch.com/" rel="nofollow">https://deploymentfromscratch.com/</a>.
Anything that depends on the disk cache 'working' should not be run in a shared environment, unless your application can tolerate running arbitrarily slowly.<p>If I understand how it works, allocating memory to a container does not reserve memory for caching of data, and any IO done by any container can evict data from the cache, which means any IO-intensive process will step on any other.
I always thought it's better to host a distributed database, like CockroachDB or YugabyteDB, across multiple Kubernetes clusters, just in case one of the clusters goes down.
The best way to run a database is in static pods. You get all the benefits of the Kubernetes ecosystem (monitoring, logs, inventory, access control) without any of the drawbacks.
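For anyone unfamiliar: a static pod is just a manifest dropped into the kubelet's manifest directory on a specific node (typically /etc/kubernetes/manifests); the kubelet runs it directly, pinned to that node, and the API server only sees a read-only mirror pod. A hedged sketch, with image, paths and password purely illustrative:

```yaml
# Saved as e.g. /etc/kubernetes/manifests/postgres.yaml on the chosen node
apiVersion: v1
kind: Pod
metadata:
  name: postgres
spec:
  containers:
  - name: postgres
    image: postgres:16
    env:
    - name: POSTGRES_PASSWORD
      value: change-me          # illustrative only; inject real credentials some other way
    volumeMounts:
    - name: data
      mountPath: /var/lib/postgresql/data
  volumes:
  - name: data
    hostPath:                   # node-local disk; the pod never moves, so this is fine
      path: /var/lib/postgres-data
      type: DirectoryOrCreate
```

The trade-off is that the scheduler never reschedules it, which is exactly the point here: the database stays put, but it still shows up in kubectl, logs, and monitoring like any other pod.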
What are the minimum resource requirements for a single micro-sized table and very low traffic? Is it on the order of 100s of MB and tenths of a vCPU? Or...?
Can't think of a scenario where you'd need to query multiple databases in the same query!
Nice demo, but deploying it all in a cluster is distracting. Maybe do your next demo with docker-compose!
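For what it's worth, the docker-compose version of a single-instance Postgres demo can be tiny (image tag, port and password below are illustrative, not from the demo):

```yaml
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example   # illustrative; don't ship real credentials in the file
    ports:
      - "5432:5432"
    volumes:
      - db-data:/var/lib/postgresql/data   # named volume so data survives restarts
volumes:
  db-data:
```

`docker compose up -d` and you have a local database without any of the cluster machinery getting in the way of the demo.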