Principles of Chaos Engineering (2018)

133 points by archielc, almost 6 years ago

7 comments

KenanSulayman, almost 6 years ago
We built a system called "friendly fire" that nukes a server every 10 minutes. It has changed the mindset of all our engineers and made our infrastructure missile-proof.

Funnily enough, it also improved our latencies a lot (which I guess is mostly because the regular restarts clear out memory leaks and the like).
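
A minimal sketch of what a periodic "nuke one server" loop could look like. This is not the commenter's actual system: the names `friendly_fire_round` and `run`, and the `terminate` callback (which would wrap whatever API actually shuts a server down), are hypothetical and chosen only for illustration.

```python
import random
import time
from typing import Callable, Optional, Sequence


def friendly_fire_round(
    servers: Sequence[str],
    terminate: Callable[[str], None],
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Terminate one randomly chosen server and return its name.

    Returns None (and terminates nothing) when the fleet is empty.
    """
    if not servers:
        return None
    rng = rng or random.Random()
    victim = rng.choice(list(servers))
    terminate(victim)  # hypothetical hook: the caller supplies the real kill
    return victim


def run(
    list_servers: Callable[[], Sequence[str]],
    terminate: Callable[[str], None],
    interval_seconds: float = 600.0,  # 10 minutes, as in the comment
    rounds: Optional[int] = None,     # None means run until interrupted
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Run one termination round per interval."""
    done = 0
    while rounds is None or done < rounds:
        # Re-list the fleet each round so replaced servers are eligible too.
        friendly_fire_round(list_servers(), terminate)
        done += 1
        sleep(interval_seconds)
```

Passing `sleep` and `terminate` in as parameters keeps the loop testable without waiting or touching real machines.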
jinqueeny, almost 6 years ago
The following link shows how we do Chaos Engineering in TiDB, an open source distributed database:

https://www.pingcap.com/blog/chaos-practice-in-tidb/

Regarding the fault injection tools we are using:

- Kernel Fault Injection: the fault injection framework included in the Linux kernel, which you can use to implement simple fault injections to test device drivers.
- SystemTap: a scripting language and tool for diagnosing performance or functional problems.
- Fail: gofail for Go and fail-rs for Rust.
- Namazu: a programmable fuzzy scheduler for testing distributed systems.

We also built our own automatic chaos platform, Schrodinger, to automate all these tests and improve both efficiency and coverage.
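
For readers unfamiliar with failpoints, the general idea behind tools like gofail and fail-rs can be sketched as below. This is an illustration in Python, not the API of either library: `failpoint`, `enabled`, `write_record`, and the failpoint name `"before-write"` are all hypothetical.

```python
from contextlib import contextmanager
from typing import Callable, Dict, Iterator

# Registry of currently active failpoints: name -> action to run when hit.
_active: Dict[str, Callable[[], None]] = {}


def failpoint(name: str) -> None:
    """Marker placed in normal code; does nothing unless the name is active."""
    action = _active.get(name)
    if action is not None:
        action()


@contextmanager
def enabled(name: str, action: Callable[[], None]) -> Iterator[None]:
    """Activate a failpoint for the duration of a with-block."""
    _active[name] = action
    try:
        yield
    finally:
        _active.pop(name, None)


def write_record(store: dict, key: str, value: str) -> None:
    """Example code under test, with one failpoint before the write."""
    failpoint("before-write")
    store[key] = value


def raise_injected_error() -> None:
    """Example injected fault: fail as if the storage layer were broken."""
    raise OSError("injected fault at before-write")
```

A test would wrap a call in `with enabled("before-write", raise_injected_error):` and then check that the caller handles the error and leaves `store` unchanged.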
jtms, almost 6 years ago
I have not used it, but I have heard this is a very useful tool: https://github.com/Netflix/chaosmonkey
azhenley, almost 6 years ago
Other useful materials:

- Chaos Monkey Guide for Engineers: https://www.gremlin.com/chaos-monkey/
- Recent HN discussion on "Resilience Engineering: Where do I start?": https://news.ycombinator.com/item?id=19898645
jorblumesea, almost 6 years ago
If you've never run a chaos experiment, how do you square up blast radius with running in prod?

It seems like this setup works great if built in from the get-go, but incredibly painful and possibly dangerous if starting with existing applications.
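
One common way to bound blast radius, sketched here under assumptions, is to cap how many hosts an experiment may touch and to abort once an error metric drifts too far from its baseline. The function names, the host-selection rule, and the thresholds below are all hypothetical and not taken from the thread or from any specific tool.

```python
import math
from typing import List, Sequence


def select_targets(hosts: Sequence[str], fraction: float, max_hosts: int) -> List[str]:
    """Pick experiment targets: at most `fraction` of the fleet and `max_hosts` hosts.

    Takes hosts from the front of the list to keep the sketch deterministic;
    a real experiment would more likely sample at random.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    count = min(max_hosts, math.floor(len(hosts) * fraction))
    return list(hosts[:count])


def should_abort(error_rate: float, baseline: float, tolerance: float) -> bool:
    """Abort when the observed error rate exceeds baseline by more than tolerance."""
    return error_rate > baseline + tolerance
```

With an existing application, starting at a very small `fraction` and a tight `tolerance` keeps the first experiments cheap to get wrong.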
dang, almost 6 years ago
A thread from 2018: https://news.ycombinator.com/item?id=16244586
agumonkey, almost 6 years ago
I see no mention of AFL, which seems like a fitting tool for the topic.

Also, the term 'antifragile' (lightly controversial) comes to mind.