Not that I want to advocate for Kubernetes, but this is just a stepping stone for another blog post in a few years, "How managed services in AWS got us fired!", detailing how, when AWS changed its pricing/strategy, the companies had no way out of their design choices.<p>The "vendor agnostic" approach was the right call to make at that point in time. Sure, every business is unique and some cases fit better with a hands-off approach (let's pay for the convenience of AWS taking care of managed offerings). But it is a fallacy to think there is no cost to pay for that decision.<p>The cost of operating your vendor-agnostic infrastructure is replaced by your team now needing to learn the intricacies of AWS: IAM, AWS's way of networking, backups, etc. Those operational needs don't just go away; they just become "easier" and more constrained to AWS's way of doing things.<p>As a consultant, one must know where to draw the line and recommend the appropriate route.
For fun, I run a bare metal k8s cluster with my blog and other projects on it. My last three nights have been spent fighting bugs: volumes not attaching, nginx magically configuring itself incorrectly, and a whole bunch of other crap. It just magically started happening, and crap like this seems to happen at least once a month. It’s to the point where I spend at least one night a week babysitting the cluster.<p>I don’t have to pay someone else to handle this, but if I did, I would get rid of k8s in a heartbeat. I’ve seen a devops team of only a few people manage tens of thousands of traditional servers, but I doubt such a small team could handle a k8s cluster of the same size.<p>I’m considering moving back to traditional architecture for my blog and other projects. K8s has been fun, but there’s too much magic everywhere.
I can confirm that maintaining a Kubernetes cluster is a full-time job. Due to its design, there are a lot of moving parts even in the most minimal deployments.<p>Low-key, I hate touching Google-created projects: technically sound on paper, but in practice a guaranteed usability disaster.
This post touches a chord. Systems like Kubernetes and Kafka are inherently complicated. My previous company moved off AWS onto bare metal and installed a k8s cluster on it. No offense to whoever architected it; we had multi-country infra, and it made sense to chase the cost advantages of cheaper alternate providers.<p>We got a lot of critical infra running on those clusters, and then tech debt slowly started accumulating. Clusters have to get updated, older DNS versions in k8s are slow, and the networking strained (older Weave versions were bursting at the seams when traffic exploded as more applications were onboarded). SRE teams get overwhelmed, and the constant requests for adding PVCs (Kafka & C* were on k8s) took a toll; a sketch of that kind of PVC request follows below.
Sanity prevailed in the end and there was a decision to move to hosted PaaS infra. I no longer work there; I'm just reminiscing about what we went through.<p>Though a "cloud-independent" solution will save pennies, it will definitely drown dollars in personnel costs and uptime/SLA.<p>History repeats itself, because we don't learn from our mistakes (ours or others').
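For readers who haven't lived this: the "constant requests for adding PVCs" mentioned above are essentially tickets to create objects like the one in this minimal sketch, written with the official `kubernetes` Python client. The namespace, names, storage class, and size are invented for illustration, not details from the comment.

```python
# Hypothetical example of a PVC request for a Kafka broker's data volume.
# Namespace, names, storage class, and size are illustrative only.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when run inside the cluster

pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="kafka-broker-0-data", namespace="kafka"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        storage_class_name="fast-local",  # assumed storage class
        resources=client.V1ResourceRequirements(requests={"storage": "500Gi"}),
    ),
)

client.CoreV1Api().create_namespaced_persistent_volume_claim(namespace="kafka", body=pvc)
```

Each of these claims ties back to provisioning, capacity planning, and backup questions, which is why they pile up on an SRE team.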
Very funny article. We are spending 2 people full time (out of 4) trying to build on AWS services. It is really a mess and costs a lot of human resources.
Having run k8s and Kafka in a previous job (I left before I got the sack), this article rings completely true.<p>New shop: Lambda and EventBridge = life is good.
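For contrast with the k8s setups discussed above, "Lambda and EventBridge" amounts to roughly this: a handler plus a rule pointing at it. This is only a sketch; the function name, schedule, and ARN are made up, and a real deployment would more likely be wired with CloudFormation/CDK/Terraform than raw boto3 calls.

```python
# Hypothetical sketch of a Lambda handler driven by an EventBridge schedule.
# Names, ARN, and schedule are illustrative only.
import json
import boto3

def handler(event, context):
    # Lambda entry point: EventBridge delivers the event as a plain dict.
    print(json.dumps(event))
    return {"status": "ok"}

def wire_up_schedule(lambda_arn: str) -> None:
    events = boto3.client("events")
    events.put_rule(Name="nightly-job", ScheduleExpression="rate(1 day)")
    events.put_targets(
        Rule="nightly-job",
        Targets=[{"Id": "nightly-job-lambda", "Arn": lambda_arn}],
    )
    # The function also needs a resource-based policy allowing
    # events.amazonaws.com to invoke it (lambda:AddPermission).
```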
I am happy to see people talking about this.<p>Every time I try to point out that systems are getting more and more complicated and call for simplicity, I am pushed away.<p>I've heard you are not taken seriously if you don't use a well-established cloud provider or similar.<p>Truth be told, there are not many projects you do or will work on that need this kind of thing.<p>We thought the cloud would help us with a lot of things, but at what cost? And by cost I mean stress, data protection, money, etc.
This is a good example of people making questionable practical decisions based on good principles. Yes, in principle it'd be good to be agnostic of your cloud provider. But there are a few issues with it. Firstly, you're going to work <i>so much harder</i> doing what you want to do on AWS while avoiding the things AWS wants you to do. They're not dumb! You don't want to be locked in? They really want you locked in, and they're going to work very hard to make that happen. Secondly, the likelihood of you ever actually making the decision to leave is extremely low, so you're paying all these costs for what is at best a theoretical risk. And finally, even if you do everything perfectly and never depend on anything uniquely Amazon... leaving is still going to suck! It's still going to be a huge amount of work to migrate away!
There seem to be two camps: those who think k8s is a godsend, and those who think it's the devil incarnate. We fall into the former.<p>We run Rancher across a couple of bare metal clusters and it's been a mostly amazing experience (ca. 3 years). The only issues we had were with Rancher-specific bugs, but those have been resolved and for the most part our infra is pretty autonomous. We do all HA at the application layer, so local NVMe as opposed to network storage. This means Patroni, Redis Sentinel/Cluster, etc. But it broadly just works. Maybe we're not big enough to bump into issues, but I couldn't imagine migrating to the labyrinth of vendor lock-in masquerading as cloud services.<p>What am I missing? Why is our experience so wildly different from others'?
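To make the "HA at the application layer" point concrete, here is a minimal sketch of a Sentinel-aware client using redis-py. The Sentinel hostnames and the "mymaster" service name are assumptions, not details from the comment; the idea is that clients discover the current master themselves, so failover needs no shared network storage underneath.

```python
# Hypothetical example: the application asks Sentinel who the current Redis
# master is, so a failover is handled without any network-attached volume.
from redis.sentinel import Sentinel

sentinel = Sentinel(
    [("redis-sentinel-0", 26379),
     ("redis-sentinel-1", 26379),
     ("redis-sentinel-2", 26379)],
    socket_timeout=0.5,
)

master = sentinel.master_for("mymaster", socket_timeout=0.5)   # writes go here
replica = sentinel.slave_for("mymaster", socket_timeout=0.5)   # reads can go here

master.set("greeting", "hello")
print(replica.get("greeting"))
```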
I don't particularly like Kubernetes at all, and while I like Kafka, it's definitely overkill for the kind of system discussed in the article. But I gotta say, 87%? What the hell? I had >98% uptime with the first Kafka tooling I ever built, and that was 3 nodes on machines shared with the ZooKeepers, producers/consumers split across three cities, and processing 10x the traffic they're talking about; maintained by <i>just me</i> on medium-range Hetzner boxes.<p>It feels like there's something deeper hiding here, more along the lines of "our developers really don't / can't care about how the software is operating in production."
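For scale, the durability of a small 3-broker setup like the one described mostly comes down to topic replication plus producer acknowledgements. Here is a rough sketch with kafka-python; the broker hostnames and topic name are invented, and the topic itself would be created with replication.factor=3.

```python
# Hypothetical producer for a small 3-broker cluster; hostnames and topic are made up.
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers=["kafka-1:9092", "kafka-2:9092", "kafka-3:9092"],
    acks="all",   # wait for all in-sync replicas before acknowledging a write
    retries=5,    # retry transient broker failures instead of dropping messages
)

producer.send("events", b'{"kind": "page_view"}')
producer.flush()
```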