Ask HN: Were you able to mitigate the impact of the AWS us-east-1 incident? How?

7 points by melor, over 8 years ago

2 comments

deftnerd, over 8 years ago
I run my infrastructure on three different providers and use GeoIP-assigned anycast DNS servers from another provider.

Asia/Australia runs on DigitalOcean, Europe on OVH, and the Americas on AWS.

When someone requests the IP address of my site's front-end domain or static asset CDN domain, my nameserver determines their geographic location and returns the IP address of the resources closest to them.

I run health checks, so when S3 (which hosts my static assets for the Americas) went down, my nameservers stopped giving out the IP addresses for the Americas systems and started giving out IP addresses for the Europe systems.

When the health checks started succeeding again, everything restored itself.

Because of low DNS TTL values, users in the Americas were only impacted for a few minutes, and only if the IP was cached by their system.
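
The health-check-driven failover described above can be sketched roughly as follows. This is only an illustration of the idea, not the commenter's actual setup: the `Region` class, the `FALLBACKS` ordering, the `resolve` function, and the IP addresses (taken from the reserved documentation ranges) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    ip: str
    healthy: bool = True  # would be updated by an external health checker


# Hypothetical preference order: nearest region first, then fallbacks.
FALLBACKS = {
    "americas": ["americas", "europe", "asia"],
    "europe": ["europe", "americas", "asia"],
    "asia": ["asia", "europe", "americas"],
}


def resolve(client_region: str, regions: dict) -> str:
    """Return the IP of the closest region that currently passes health checks."""
    for name in FALLBACKS[client_region]:
        region = regions[name]
        if region.healthy:
            return region.ip
    raise RuntimeError("no healthy region available")


if __name__ == "__main__":
    regions = {
        "americas": Region("americas", "192.0.2.10"),
        "europe": Region("europe", "198.51.100.10"),
        "asia": Region("asia", "203.0.113.10"),
    }
    print(resolve("americas", regions))  # served from the Americas
    regions["americas"].healthy = False  # e.g. S3 health check fails
    print(resolve("americas", regions))  # now served from Europe
```

In a real deployment the answer would also carry a low TTL, so that resolvers pick up the changed address within minutes, as the comment notes.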
melor, over 8 years ago
We host a number of our customers' database systems on us-east-1.

What worked well for us (https://aiven.io):

- Architecturally relying on only a few cloud provider services (we only need VMs, disk, and object storage)
- Upfront investment in being able to move services from one region to another without downtime
- Pre-existing tooling for easily (manually) reconfiguring backup destinations on the fly
- Not running everything on just AWS

What did not work so well:

- Backups should automatically reroute to a secondary backup site after N consecutive failures
- Alert spam; we need more aggregation
- New failure mode: extremely slow EBS access. Some affected VMs were kind of working, but very slowly; we need to create a separate alert trigger for this
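
The "reroute after N consecutive failures" item could look something like the sketch below. It is a guess at one simple way to do it, not a description of Aiven's tooling: the `BackupRouter` class, its `record` method, the destination names, and the default threshold of 3 are all hypothetical.

```python
class BackupRouter:
    """Switch backup destination after N consecutive failed uploads."""

    def __init__(self, primary: str, secondary: str, max_failures: int = 3):
        self.primary = primary
        self.secondary = secondary
        self.max_failures = max_failures
        self.failures = 0  # consecutive failures on the active destination
        self.active = primary

    def record(self, success: bool) -> str:
        """Record the outcome of one backup attempt and return the
        destination to use for the next attempt."""
        if success:
            self.failures = 0  # any success resets the streak
        else:
            self.failures += 1
            if self.active == self.primary and self.failures >= self.max_failures:
                # Assumption: failing back to the primary is left to an operator.
                self.active = self.secondary
                self.failures = 0
        return self.active


if __name__ == "__main__":
    router = BackupRouter("s3-us-east-1", "secondary-site", max_failures=3)
    for outcome in [False, False, False]:
        destination = router.record(outcome)
    print(destination)  # secondary-site
```

Counting only consecutive failures keeps a single transient error from triggering a reroute, at the cost of reacting a few attempts late during a real outage.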