Man, I really have the impression that my video-on-demand service has a better understanding of availability and risk management than either my bank or any government service. Could you imagine, say, a utility company creating a job title called 'Chaos Engineer'?
Netflix puts out some great articles about architecture in the cloud: auto-scaling, Chaos Monkey, and how they handle 'steal time.' Does anyone know of any other company that publishes so much about cloud architecture? This is great stuff!
I wonder if Netflix has ever come out with some kind of, "So you want to get into chaos engineering, eh?" kind of article that explains the basics and some pitfalls/things to look out for.
I assume that other big internet companies also practice chaos engineering under a different name, but having such a name for the job is awesome. It highlights the difference from traditional stress testing. Names have surprising power. 'Growth Hacker' was a somewhat annoying but very effective title trend, and it helped communicate how the approach differs from traditional marketing efforts. I think 'Chaos Engineer' has the same potential.
It's great to see Netflix taking disaster recovery and chaos mitigation seriously. Learning to work with constant failure is one of the biggest challenges for anyone working with distributed systems at scale, and concepts like the Chaos Monkey help enormously. I hope other companies follow suit, and soon.
How does one go about becoming a chaos engineer? I imagine up to this point it is a field one falls into accidentally and gains experience over time. I can imagine it becoming a topic taught at a college level in the near future.