DNS Outage Post Mortem

55 points by streeter, over 11 years ago

6 comments

jjoe, over 11 years ago
These are some of the corner cases that are put on the back burner en route to delivering an MVP. Just like you don't do early optimization of an infrastructure, you almost never enumerate all possible issues that can crop up under a less than ideal situation.
This isn't a GitHub-only issue but rather one that would affect all quick-to-launch startups (most). What I'm learning from this is that one needs to regularly revisit the infrastructure and how it's glued together with the provisioning system.
If it's not broken, break it.
bscanlan, over 11 years ago
"an initial verification led us to believe the changes had been rolled out successfully"
I would love more detail on the type II error in this validation step; it is worth exploring deeper. What was the verification step? Why did it not detect the issue? What review process was used for the verification step?
While the failed verification step is not the root cause, having good safety checks is the most important part of planning good changes, whether they're DNS reconfigurations, network changes or software deployments.
badmadrad, over 11 years ago
This is why I like Chef... I feel there are tools out there to test your code better (FoodCritic, ChefSpec, Test Kitchen) before rolling to production and having to validate machines in production... ouch
overworkedasian, over 11 years ago
Who in their right mind would schedule a critical infrastructure upgrade during the day?
iwasphone, over 11 years ago
13:20 PST = 16:20 EST
nullrouted, over 11 years ago
What a seriously dumb outage to have. I'm still confused about it after reading the RFO.