I recently rebuilt my Kubernetes cluster running across three dedicated servers hosted by Hetzner and decided to document the process. It turned into a series (eight parts so far) covering everything from bootstrapping and firewalls to setting up persistent storage with Ceph.

Part I: Talos on Hetzner
https://datavirke.dk/posts/bare-metal-kubernetes-part-1-talos-on-hetzner/

Part II: Cilium CNI & Firewalls
https://datavirke.dk/posts/bare-metal-kubernetes-part-2-cilium-and-firewalls/

Part III: Encrypted GitOps with FluxCD
https://datavirke.dk/posts/bare-metal-kubernetes-part-3-encrypted-gitops-with-fluxcd/

Part IV: Ingress, DNS and Certificates
https://datavirke.dk/posts/bare-metal-kubernetes-part-4-ingress-dns-certificates/

Part V: Scaling Out
https://datavirke.dk/posts/bare-metal-kubernetes-part-5-scaling-out/

Part VI: Persistent Storage with Rook Ceph
https://datavirke.dk/posts/bare-metal-kubernetes-part-6-persistent-storage-with-rook-ceph/

Part VII: Private Registry with Harbor
https://datavirke.dk/posts/bare-metal-kubernetes-part-7-private-registry-with-harbor/

Part VIII: Containerizing our Work Environment
https://datavirke.dk/posts/bare-metal-kubernetes-part-8-containerizing-our-work-environment/

And of course, when it all falls apart: Bare-metal Kubernetes: First Incident
https://datavirke.dk/posts/bare-metal-kubernetes-first-incident/

The source code repository (set up in Part III) for node configuration and deployed services is available at https://github.com/MathiasPius/kronform

While the documentation was initially intended as a future reference for myself, and as a log of the decisions I made and why I made them, I've already received some really good feedback and ideas, and figured it might be interesting to the hacker community :)
Thankfully we've never had the need for such complexity and are happy with the GitHub Actions > Docker Compose > GCR > SSH solution [1] we're using to deploy 50+ Docker containers.

It requires no infrastructure dependencies, and the deployment scripts are stateless and checked into the same repo as the project. Once the GitHub organization is set up (4 secrets) and the deployment server has Docker Compose + nginx-proxy installed, deploying an app only requires 1 GitHub Actions secret. It doesn't get any simpler for us, and we'll keep using this approach for as long as we can.

[1] https://servicestack.net/posts/kubernetes_not_required
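Roughly, the shape of that pipeline is a single workflow: build the image, push it to a registry, then SSH into the server and bounce the compose stack. The sketch below is generic - the secret names, registry and paths are placeholders, not the exact setup from the linked post:

    # .github/workflows/deploy.yml - generic sketch of the build-push-ssh flow
    name: deploy
    on:
      push:
        branches: [main]
    permissions:
      contents: read
      packages: write
    jobs:
      deploy:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          # Build the image and push it to a registry (GHCR assumed here)
          - uses: docker/login-action@v3
            with:
              registry: ghcr.io
              username: ${{ github.actor }}
              password: ${{ secrets.GITHUB_TOKEN }}
          - uses: docker/build-push-action@v5
            with:
              push: true
              tags: ghcr.io/my-org/my-app:latest
          # SSH into the deployment server and restart the app with docker compose
          - name: deploy over ssh
            run: |
              echo "${{ secrets.DEPLOY_SSH_KEY }}" > key && chmod 600 key
              ssh -i key -o StrictHostKeyChecking=accept-new \
                ${{ secrets.DEPLOY_USER }}@${{ secrets.DEPLOY_HOST }} \
                "cd /opt/my-app && docker compose pull && docker compose up -d"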
> Ceph is designed to host truly massive amounts of data, and generally becomes safer and more performant the more nodes and disks you have to spread your data across.

I'm very pessimistic about Ceph in the scenario you have - maybe I've missed it, but I've seen nothing about upgrading the networking, and by default you're going to have 1Gbit on a single interface shared between the public network and the internal vSwitch.

Even by your own benchmarks, the write test comes out at 19 IOPS (the block size is huge, though - if those are rados bench's default 4 MB objects, 19 IOPS is already ~76 MB/s, which is close to everything a 1Gbit link can carry):

    Max bandwidth (MB/sec): 92
    Min bandwidth (MB/sec): 40
    Average IOPS: 19
    Stddev IOPS: 2.62722
    Max IOPS: 23
    Min IOPS: 10

Meanwhile a *single* HDD would give ~120 IOPS, and a *single* 3-year-old datacenter NVMe drive gives ~33,000 IOPS with 4k blocks + fdatasync=1.

I believe 1Gbit networking would be a very limiting factor for Ceph - I'd put a clear disclaimer on that for fellow sysadmins.

P.S. The amount of work you've done is huge and appreciated.
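If you do upgrade the networking (e.g. a second NIC or a 10G uplink), the usual way to let Ceph take advantage of it is to move replication and recovery traffic onto a dedicated cluster network, separate from client traffic. Since the series uses Rook, a rough sketch of that could be the ceph.conf override ConfigMap below - the CIDRs are placeholders, and this assumes host networking is enabled for the CephCluster:

    # ConfigMap that Rook reads to inject extra ceph.conf settings (sketch only)
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: rook-config-override
      namespace: rook-ceph
    data:
      config: |
        [global]
        # client <-> OSD traffic stays on the public interface
        public_network = 203.0.113.0/24
        # OSD <-> OSD replication and recovery move to the dedicated network
        cluster_network = 10.0.0.0/24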
Here's what I don't really get. Let's say you have three hosts and create your cluster.
You still need a reverse proxy or load balancer in front, right? I mean not inside the cluster, but something to route requests to the nodes of the cluster that are not currently down.
You could set up something like HAProxy on another host, but then you once again have a single point of failure. So do you replicate that part as well and use DNS to make sure a reachable reverse proxy is used?
Maybe I'm just misunderstanding how it works, but multiple nodes in a cluster still need some sort of central entry point, right? So what is the correct way to do this?
I've come to the conclusion (after trying kops, kubespray, kubeadm, kubeone, GKE, EKS) that if you're looking at a cluster of fewer than 100 nodes, Docker Swarm should suffice. It's easier to set up, maintain and upgrade.

Docker Swarm is to Kubernetes what SQLite is to PostgreSQL. To some extent.
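For reference, the whole Swarm deployment story is an init plus a stack file. A generic sketch (the service name and image are placeholders, not anyone's actual setup):

    # stack.yml - generic sketch of a Swarm service definition
    version: "3.8"
    services:
      web:
        image: nginx:alpine
        ports:
          - "80:80"
        deploy:
          replicas: 3            # spread across the swarm nodes
          update_config:
            order: start-first   # start new tasks before stopping old ones

Bring the cluster up with docker swarm init (plus docker swarm join on the other nodes) and roll it out with docker stack deploy -c stack.yml web.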
I'm going through your series now. Very well done.

I thought I would mention that age is now built into SOPS, so it needs no external dependencies and is faster and easier than gpg.
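For anyone making that switch: generate a key pair with age-keygen, point .sops.yaml at the public key, and (if you're using Flux as in Part III) load the private key into the cluster so the kustomize-controller can decrypt. A minimal sketch, with a placeholder recipient:

    # .sops.yaml - minimal sketch; the age recipient below is a placeholder
    creation_rules:
      - path_regex: .*\.secret\.yaml$
        encrypted_regex: ^(data|stringData)$
        age: age1xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

age-keygen -o age.agekey creates the key pair, sops -e -i something.secret.yaml encrypts in place, and the Flux docs cover loading the private key into the cluster as a secret referenced from the Kustomization's decryption settings.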
Speaking of k8s, does anyone here know of ready-made solutions for getting Xcode (i.e. xcodebuild) running in pods? As far as I'm aware, there is no good way to run Xcode on Linux, so at the moment I'm just futzing about with a virtual-kubelet [0] implementation that spawns macOS VMs. This works just fine, but the problem seems like such an obvious one that I expect there are existing solutions I've just missed.

[0]: https://github.com/virtual-kubelet/virtual-kubelet/
Very nice write-up!

I wonder if it's possible to combine the custom ISO with cloud-init [0] to automate the initial node installation?

[0]: https://github.com/tech-otaku/hetzner-cloud-init
Great post. We (Koor) have been going through something similar to create a demo environment for Rook-Ceph. In our case, we want to show different types of data storage (block, object, file) in a production-like system, albeit at the smaller end of scale.

Our system is hosted at Hetzner on Ubuntu. KubeOne does the provisioning, backed by Terraform. We are using Calico for networking, and we have our own Rook operator.

What would have made the Rook-Ceph experience better for you?
Just finished reading part one and wow, what an excellently written and presented post. This is exactly the series I needed to get started with Kubernetes in earnest. It’s like it was written for me personally. Thanks for the submission MathiasPius!
If people get the idea from this that they should get a bare-metal server at Hetzner and try it themselves: don't. They will probably reject you, they are very picky.

And if you are from a developing country like India, don't even think about it.