I really wish Kubernetes would make its storage backend pluggable, or that the k3s folks would push their work upstream to allow a SQL database as the backend. Then you could just back Kubernetes with some Cloud SQL offering.
One of the failure conditions is a new follower reading data off the leader as it bootstraps, which adds extra load on the system.<p>It seems like a new follower could pull its initial snapshot from another follower instead?
Note that the etcd project ignored this report of a data loss/corruption bug on macOS:<p><a href="https://github.com/etcd-io/bbolt/issues/124" rel="nofollow">https://github.com/etcd-io/bbolt/issues/124</a>
> <i>For instance, a flaky (or rejoining) member drops in and out, and starts campaign. This member ends up with higher terms, ignores all incoming messages with lower terms, and sends out messages with higher terms. When the leader receives this message of a higher term, it reverts back to follower. This becomes more disruptive when there’s a network partition.</i><p>I'm glad this has been fixed, considering that one of the main use cases for partition-tolerant data stores is to tolerate partitions.<p>Cloud Foundry used earlier versions of etcd, and this category of problem was the leading cause of severe outages, to the point that several years of effort were invested to tear it out of everything and replace it with bog-ordinary RDBMSes.<p>Disclosure: I work for Pivotal, and we did a lot of that work, but I wasn't on the front line. I was just watching from a safe distance.
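To make the quoted failure mode concrete, here is a toy sketch of the Raft term rule and of how a pre-vote phase, a well-known Raft mitigation, would avoid the step-down. That pre-vote is the actual fix being referred to is an assumption on my part. This is not etcd's code: `Node`, `election_timeout`, and `simulate` are hypothetical names, and the pre-vote check is simplified to "could a majority be reached at all", where a real implementation would also check log freshness and recent leader contact.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    term: int = 1
    role: str = "follower"

    def on_message(self, msg_term: int) -> None:
        """Basic Raft term rule: seeing a higher term forces a step-down."""
        if msg_term > self.term:
            self.term = msg_term
            self.role = "follower"
        # Messages carrying a lower (or equal) term change nothing here.


def election_timeout(node: Node, reachable_peers: int,
                     cluster_size: int, pre_vote: bool) -> None:
    """Hypothetical handling of one election timeout on `node`."""
    if pre_vote:
        # Simplified pre-vote: only start a real campaign if this node plus
        # the peers it can reach would form a majority.
        if reachable_peers + 1 <= cluster_size // 2:
            return  # no majority possible, so keep the current term
    node.term += 1
    node.role = "candidate"


def simulate(pre_vote: bool) -> Node:
    """A partitioned member times out three times, then rejoins."""
    leader = Node("leader", term=1, role="leader")
    flaky = Node("flaky", term=1)
    for _ in range(3):
        election_timeout(flaky, reachable_peers=0,
                         cluster_size=3, pre_vote=pre_vote)
    # Partition heals: the flaky member's next message reaches the leader.
    leader.on_message(flaky.term)
    return leader


if __name__ == "__main__":
    print(simulate(pre_vote=False))  # leader has been forced to step down
    print(simulate(pre_vote=True))   # leader is undisturbed
```

Without pre-vote, the isolated member inflates its term on every timeout and knocks the healthy leader over when it reconnects. With the pre-vote check, its term never moves, so its messages carry nothing that forces a step-down.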