I would echo the general sentiment in this article, based on my experience with bare-metal clusters.<p>I started with MicroK8s, and while it's a functional solution for some use cases, I was genuinely disappointed in its overall behavior (especially with small clusters, in the 3 to 10 node range).<p>The biggest hit was Dqlite. Overall, I had a tremendous number of problems that originated explicitly with Dqlite: unexpectedly high CPU usage, failure to form consensus with even node counts (especially after a network split), manual configuration files that needed to be deleted or renamed to get specific hosts back into the cluster, and generally poor performance for a long-term setup (a 2-year-old cluster stalled to basically a standstill, spinning on Dqlite).<p>I have not used Dqlite in other projects, so it's possible this was a MicroK8s problem, but based on my experience with MicroK8s... I won't touch either of these projects again.<p>I switched to K3s about 3 years ago and have had essentially no problems: considerably fewer random headaches, no unexpected performance degradation, very stable, incredibly pleasant to work with.<p>---<p>I have also migrated about half of my workloads to Longhorn-backed PVs at this point (coming from a large shared NAS exposed as NFS), and while I've had a couple more headaches here than with K3s itself, this has been surprisingly smooth sailing as well, while giving me much more flexibility in how I manage my block devices (for context, I'm small - so just under a petabyte of storage, of which ~60% is in use).<p>If you want to run a cluster on hardware you own rather than rent, K3s and Longhorn are amazing tools for it, and I really have to give Rancher/SUSE a hand here. It's nice tooling.
If Docker Swarm is on the table (it's practically abandonware), I'd like to throw in Nomad by HashiCorp. It's also fairly lightweight, very flexible (it can run various types of containers but also exec any executable), and has a decent ecosystem (it supports CNI and CSI). It can scale from a single node running both control plane and workloads to tens of thousands of nodes, and the workers can also be geographically spread out (different regions/zones/DCs).<p>A few years back I wrote about it, and most of the core principles of the article are still valid:<p><a href="https://atodorov.me/2021/02/27/why-you-should-take-a-look-at-nomad-before-jumping-on-kubernetes/" rel="nofollow">https://atodorov.me/2021/02/27/why-you-should-take-a-look-at...</a><p>Disclaimer: I work at HashiCorp, but I've had that opinion since before joining, and in fact it's among the reasons I joined.
K3s is truly great - I've been using it for years for just about everything not warranting a full cluster. MicroK8s feels like it does too much in non-standard ways, and when it breaks you're suddenly dealing with Snap-related issues and it's a total immersion break.<p>Lately there's also RKE2 (<a href="https://docs.rke2.io/" rel="nofollow">https://docs.rke2.io/</a>), which I've been growing fond of. It's only marginally trickier to set up, with the bonus of being a more 'standard' cluster distribution with more knobs to twist.<p>Not that I'd shy away from running K3s in production, but it seems easier to follow the 'standard Kubernetes way' for things without having to account for some of K3s's default configuration choices - which, again, aren't bad at all for folks who do not need all of the different options.<p>For edge workloads, smaller clusters, and less experienced operators who want to run Kubernetes platforms themselves without depending on a managed provider, K3s is nearly impossible to beat.
I'd be curious to know what prompt was used to generate this article. I don't mean any offence; most content on the web is generated by LLMs these days one way or another anyway. I'm just curious about the exact prompt used in this case.
This article is centered around Hetzner as a cloud provider but I'm not sure I'd trust Hetzner for anything actually important. It's been known to boot regular, paying customers off the platform without appeal, without prior announcement and without giving proper time to back up data and move the service elsewhere.