Massive S3 outage. Seems to affect other AWS services (SDB, SQS) as well.<p>Other AWS services are down too ...
EC2 http://developer.amazonwebservices.com/connect/thread.jspa?threadID=19715&tstart=0
SQS http://developer.amazonwebservices.com/connect/thread.jspa?threadID=19713&tstart=0
Interestingly, my desktop applications did not go down this morning. They were right there waiting for me as I sipped my coffee.<p>They also didn't go down any of the times the 37Signals web apps I pay for, or the hosted FogBugz installation my company uses, went down.<p>Desktop apps rock! :)
So that is a 6-hour complete outage in around 22 months since the beta opened. The lifetime downtime is roughly 6/(30 * 22 * 24) = 0.00038 = 0.038%! I think it is a pretty impressive achievement to build a system with 99.962% uptime. Especially for the poor engineers who were woken up at 2am in Seattle and had to figure out what went wrong and get it back online. I think it is pretty cool.<p>Compare that with the case where our PCs/Macs crash. Even if I could rush to a Circuit City/JR store to get a replacement hard drive, I would probably spend about the same amount of time just reviving my system, and that assumes I have a good habit of backing it up. If not, I would need to reinstall the operating system and applications, and I guess the downtime would be 24 to 48 hours.<p>So for a person without good backup habits, the uptime would be 99.848% if it takes 24 hours to get the system back once in 22 months.
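For anyone who wants to check the back-of-the-envelope numbers above, here is a minimal Python sketch. It assumes the same simplifications as the comment: 30-day months, a single outage over the whole 22-month period, and a hypothetical helper name `uptime_percent` that is not from any library.

```python
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30  # assumption: same 30-day approximation the comment uses


def uptime_percent(downtime_hours: float, months: int) -> float:
    """Return uptime as a percentage of the total hours in `months` months."""
    total_hours = months * DAYS_PER_MONTH * HOURS_PER_DAY
    return 100.0 * (1 - downtime_hours / total_hours)


# S3: one 6-hour outage in roughly 22 months
s3 = uptime_percent(6, 22)
# Desktop without backups: one 24-hour rebuild in the same period
desktop = uptime_percent(24, 22)

print(f"S3:      {s3:.3f}%")       # 99.962%
print(f"Desktop: {desktop:.3f}%")  # 99.848%
```

With a 48-hour rebuild instead of 24, `uptime_percent(48, 22)` gives the lower end of the desktop estimate.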
I think the internet needs an S3 clone, offered by another company. Both companies would be better off because of each other.<p>S3 is still more reliable than a couple of dedicated servers, though :)
Phew, back up. Although the fact that it was possible for the entire network to go down is quite worrying.<p>S3 actually has an SLA: <a href="http://aws.amazon.com/s3-sla" rel="nofollow">http://aws.amazon.com/s3-sla</a> If I'm reading that right, if S3 is completely down for more than about 40 minutes in February (which it was, about 90 minutes by my count) then we should get a 10% discount for this month. Is that right?
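The "about 40 minutes" figure above can be sanity-checked with a rough sketch like the one below. It assumes the SLA threshold is 99.9% monthly uptime (that number is an assumption here, chosen because it matches the 40-minute estimate; check the SLA page for the real terms) and that uptime is measured as plain wall-clock minutes, which may not be how the SLA actually defines it. The function name is hypothetical.

```python
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(days_in_month: int, uptime_target_percent: float) -> float:
    """Minutes of downtime a month can absorb before dropping below the target."""
    total_minutes = days_in_month * MINUTES_PER_DAY
    return total_minutes * (1 - uptime_target_percent / 100)


UPTIME_TARGET = 99.9       # assumption: threshold not quoted in the comment
OBSERVED_DOWNTIME = 90     # minutes, "by my count" in the comment above

# February has 28 or 29 days; both land close to 40 minutes.
for days in (28, 29):
    budget = allowed_downtime_minutes(days, UPTIME_TARGET)
    breached = OBSERVED_DOWNTIME > budget
    print(f"{days} days: budget {budget:.2f} min, breached: {breached}")
```

Either way, 90 minutes of full downtime is roughly double the budget, so under these assumptions the month would fall below the threshold.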
Kathrin of the Amazon Web Services Team has posted some more specific details on the failure here. In summary, it seems their authentication service was overloaded.<p><a href="http://developer.amazonwebservices.com/connect/message.jspa?messageID=79982#79982" rel="nofollow">http://developer.amazonwebservices.com/connect/message.jspa?...</a>
Our site seems to be running fine (EC2/S3). We actually have all files currently on EC2 and backed up to S3 (we haven't checked yet whether the backup is still working).