
How Canary Deployments Work in Kubernetes, Istio and Linkerd

45 points by sickeythecat almost 6 years ago

3 comments

GauntletWizard almost 6 years ago
The article talks about a bunch of features around canary routing that Kubernetes does not have. All of these features are misfeatures. They are ingrained because it used to be easiest to deploy that way: you deployed to your Australian data center, so all your Australian users got the new version, and that was a good enough sample. Kubernetes makes deploying a canary easy, so deploy a canary everywhere.

Likewise, it complains that "10% canary is hard" because you need to scale the deployments in sync. This is a problem with your monitoring layer: your alerts, triggers, and thresholds should all be based on proportion of traffic, not the other way around. Your graphs should scale to the number of canary instances. Every canary metric should be either an absolute or a proportion of traffic.

In short: your canary should look as close to normal traffic as possible. There should be no appreciable difference between the traffic going to your canaries and the traffic going to the rest of your production deployment. Your canary is a production deployment, and any deviance is either intentional or a sign of a defect.
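A minimal sketch of the proportional-metrics idea above, in Python: judge the canary by its error rate as a share of its own traffic, so a 10% canary and the 90% baseline are compared on the same scale. All function names and numbers here are illustrative, not from the article or the comment.

    # Illustrative sketch: canary health as a proportion of its own traffic.
    def error_rate(errors: int, requests: int) -> float:
        # Errors as a share of the traffic this deployment actually served.
        return errors / requests if requests else 0.0

    def canary_healthy(canary_errors, canary_requests,
                       baseline_errors, baseline_requests,
                       tolerance=0.01):
        # Healthy if the canary's error proportion stays within `tolerance`
        # of the baseline's, regardless of how many replicas each side runs.
        return (error_rate(canary_errors, canary_requests)
                <= error_rate(baseline_errors, baseline_requests) + tolerance)

    # A canary serving 1,000 requests is judged on the same scale as a
    # baseline serving 9,000: 0.5% vs. ~0.44% error rate here, so it passes.
    print(canary_healthy(5, 1_000, 40, 9_000))  # True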
jbergstroem almost 6 years ago
For me, the missing piece of canary deployments was how to handle gradual traffic changes ("confidence building", if you will). I now handle this with Flagger (https://docs.flagger.app/) and Istio.
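For context, a conceptual Python sketch of the gradual traffic-shifting loop that a tool like Flagger automates: shift a small share of traffic to the canary, wait, check metrics, and either continue or roll back. set_traffic_split and check_canary_metrics are hypothetical stand-ins for patching Istio route weights and querying a metrics backend; this is not Flagger's actual API.

    import time

    def set_traffic_split(canary_weight: int) -> None:
        # Stand-in for e.g. patching an Istio VirtualService's route weights.
        print(f"routing {canary_weight}% of traffic to the canary")

    def check_canary_metrics() -> bool:
        # Stand-in for querying success rate / latency from a metrics backend.
        return True

    def progressive_rollout(step=10, max_weight=50, interval_s=60):
        # Raise the canary's share of traffic step by step; roll back on any
        # failed check, promote once it has held at max_weight.
        weight = 0
        while weight < max_weight:
            weight = min(weight + step, max_weight)
            set_traffic_split(weight)
            time.sleep(interval_s)
            if not check_canary_metrics():
                set_traffic_split(0)  # roll back
                return False
        return True  # safe to promote the canary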
labrabbit almost 6 years ago
Glasnostic CEO here. These are all valid points and good, straightforward ways of doing things in a perfectly engineered world, where they work until they don't.

Problem is, our world is rarely perfectly engineered. Our perfectly engineered application becomes a service to other applications. Some other team deploys something that affects our dependencies. An external partner hammers our API. Our managed service provider has an issue. Noisy neighbors cause shock waves; gray failures compound. These things happen irrespective of whether your code is correct or not.

Unless your service architecture is small, "perfectly engineered" is an anti-pattern, because it is too expensive to track down and code against such events, no matter whether we run in Kubernetes/Istio or elsewhere. Operational challenges always require operators.