Auto Recovery for Amazon EC2

167 points by tshtf, over 10 years ago

13 comments

CarlHoerberg, over 10 years ago
Finally! It's crazy that they haven't implemented this earlier. And why isn't it enabled by default, like on GCE? For a long time we've had an app that just polls the EC2 API, looks for impaired instances, and automatically restarts them. We have about 2-10 impaired, scheduled-for-reboot, or on-deprecated-hardware instances per month, so that app is quite a time-saver.
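A polling app of the kind described might be sketched roughly as follows. This is a minimal illustration, not the commenter's actual code: it assumes the status data has the shape returned by boto3's `describe_instance_status` call, and `find_impaired` and the sample data are hypothetical.

```python
from typing import Any, Dict, List


def find_impaired(response: Dict[str, Any]) -> List[str]:
    """Return IDs of instances whose system status check reports 'impaired'.

    `response` is assumed to look like the dict returned by the boto3
    EC2 client method describe_instance_status().
    """
    impaired = []
    for status in response.get("InstanceStatuses", []):
        system = status.get("SystemStatus", {}).get("Status")
        if system == "impaired":
            impaired.append(status["InstanceId"])
    return impaired


# Hand-written sample data for illustration only (not real AWS output).
sample = {
    "InstanceStatuses": [
        {"InstanceId": "i-aaaa1111", "SystemStatus": {"Status": "ok"}},
        {"InstanceId": "i-bbbb2222", "SystemStatus": {"Status": "impaired"}},
    ]
}
print(find_impaired(sample))
```

In a real poller the response would come from the EC2 API on a timer, and the returned IDs would be handed to whatever restart logic the operator prefers.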
bmurphy1976, over 10 years ago
Please note that this is for EBS-backed instances only.

If you want something similar for ephemeral instances, do what we do: min 1, max 1 Auto Scaling groups. We've found that Amazon is pretty good at catching bad instances and terminating them, although on occasion we do have to terminate an instance manually. The Auto Scaling group takes care of the rest.
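The "min 1, max 1" setup could look something like the sketch below. It only builds the keyword arguments one might pass to the boto3 Auto Scaling client's `create_auto_scaling_group`. The helper name, group name, launch configuration name, and zones are all made up for illustration.

```python
from typing import Any, Dict, List


def singleton_asg_params(name: str, launch_config: str, zones: List[str]) -> Dict[str, Any]:
    """Build arguments for an Auto Scaling group pinned to exactly one instance."""
    return {
        "AutoScalingGroupName": name,
        "LaunchConfigurationName": launch_config,
        "MinSize": 1,          # never scale below one instance
        "MaxSize": 1,          # never scale above one instance
        "DesiredCapacity": 1,
        "AvailabilityZones": zones,
    }


# Hypothetical names, for illustration only.
params = singleton_asg_params("bastion-asg", "bastion-lc", ["us-east-1a", "us-east-1b"])
print(params["MinSize"], params["MaxSize"])
```

With both bounds set to 1, a terminated instance is replaced by a fresh one launched from the same configuration, which is why nothing important should live only on the instance's local disk.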
oellegaard, over 10 years ago
Heavy EC2 user here. This doesn't solve your problems. If you want to do this right, set up an EC2 Auto Scaling group and build an image each time you need to change your server. That is the proven way most large deployments work, including Netflix.
bkeroack, over 10 years ago
At the risk of being downvoted, let me say that this is yet another AWS "feature" that is primarily a workaround for deficiencies in the platform.
biot, over 10 years ago
Any reason why this isn't automatic? From the "Recover your instance" docs:

    Examples of problems that cause system status checks to fail include:
    * Loss of network connectivity
    * Loss of system power
    * Software issues on the physical host
    * Hardware issues on the physical host

All of these are on the physical host, which end users cannot control. So if AWS has an issue that kills your VM and you don't have this set up, your instance is effectively dead?
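For context, setting this up means attaching a CloudWatch alarm on the system status check with the EC2 recover action. The sketch below only builds the arguments one might pass to boto3's CloudWatch `put_metric_alarm`. The helper name, alarm name, period, and threshold values are illustrative assumptions, not recommended settings.

```python
from typing import Any, Dict


def recovery_alarm_params(instance_id: str, region: str) -> Dict[str, Any]:
    """Build arguments for an alarm that recovers an instance on system check failure."""
    return {
        "AlarmName": f"auto-recover-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,                 # seconds per evaluation window (assumed)
        "EvaluationPeriods": 2,       # consecutive failing windows (assumed)
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:recover"],
    }


params = recovery_alarm_params("i-bbbb2222", "us-east-1")
print(params["AlarmActions"][0])
```

The alarm watches only the system status check, so it reacts to the host-level failures listed above and not to problems inside the guest OS.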
alrs, over 10 years ago
The ugly caveat isn't VPC, it's EBS.

This lands on the wrong side of pets-versus-cattle. AWS has been moving towards giving people what they want, but it's still best practice to use ephemeral storage and architect accordingly.
saryant, over 10 years ago
I've been having a lot of issues with r3.large instances becoming unreachable lately. Hoping this can serve as a stopgap.
andr, over 10 years ago
I think CodeDeploy is quite an undervalued AWS tool. It's a combination of Puppet for server config and Heroku-style deploys. Together with Auto Scaling it makes it trivial to set up any number of identical servers, without relying on custom AMIs or recovery.
tedunangst, over 10 years ago
Wouldn't transparent migration to new hardware be even better? Isn't one of the advantages of virtualization the ability to move a running image from one machine to another?
fletchowns, over 10 years ago
An important note if you want to use this right away:

"This feature is currently available for the C3, C4, M3, R3, and T2 instance types running in the US East (Northern Virginia) region; we plan to make it available in other regions as quickly as possible. The instances must be running within a VPC, must use EBS-backed storage, but cannot be Dedicated Instances."
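The launch-time constraints quoted above can be expressed as a small check. This is a sketch under the assumption that the quoted rules are the complete list. The function and parameter names are hypothetical, and `us-east-1` is the API name for US East (Northern Virginia).

```python
# Constraints as quoted from the announcement at the time of this thread.
SUPPORTED_FAMILIES = {"c3", "c4", "m3", "r3", "t2"}
SUPPORTED_REGIONS = {"us-east-1"}


def can_auto_recover(instance_type: str, region: str, in_vpc: bool,
                     ebs_backed: bool, dedicated: bool) -> bool:
    """Return True if an instance meets the quoted eligibility rules."""
    family = instance_type.split(".")[0].lower()  # "r3.large" -> "r3"
    return (
        family in SUPPORTED_FAMILIES
        and region in SUPPORTED_REGIONS
        and in_vpc
        and ebs_backed
        and not dedicated
    )


print(can_auto_recover("r3.large", "us-east-1", True, True, False))
print(can_auto_recover("m1.small", "us-east-1", True, True, False))
```

The first call prints `True` and the second prints `False`, because the M1 family is not in the quoted list.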
j-kidd, over 10 years ago
This should be a great fit for the NAT/Bastion instance, since the high-availability setup has a few drawbacks: https://aws.amazon.com/articles/2781451301784570
kolev, over 10 years ago
If you rely on something like this, you rely on nothing. This is like crutches for your broken architecture. For singleton roles, you could do an autoscaling group of one and do better.
halayli, over 10 years ago
This makes me so happy.