I checked their health status page. All is good. /s<p><a href="https://downdetector.com/status/aws-amazon-web-services/" rel="nofollow">https://downdetector.com/status/aws-amazon-web-services/</a>
They really need to stop requiring SVPs or higher to approve a non-green status on the status page, as other HNers revealed in last week's AWS post. It's effectively not a status page, and they could probably be sued if it can be demonstrated that X service was down while the status page showed green (since the SLA is based on the status page). It should be automated and based on sample deployments running in every region and every service. And they should use non-AWS instances to do the sampling, so they can actually sample when, say, we experience the obligatory Black Friday us-east-1 outage every year.
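A rough sketch of the aggregation half of that idea, assuming probe results have already been collected from sample deployments by machines outside AWS. `ProbeResult`, `summarize`, and the two thresholds are hypothetical names and values picked for illustration, not anything AWS actually runs:

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeResult:
    """One health check against a sample deployment (hypothetical shape)."""
    region: str
    service: str
    ok: bool


def summarize(results, degraded_below=1.0, down_below=0.5):
    """Map each (region, service) pair to 'green', 'yellow' or 'red'.

    The color depends only on the fraction of successful probes, so no
    human sign-off is involved. Thresholds are illustrative assumptions.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [successes, total]
    for r in results:
        entry = counts[(r.region, r.service)]
        entry[0] += int(r.ok)
        entry[1] += 1

    status = {}
    for key, (successes, total) in counts.items():
        rate = successes / total
        if rate < down_below:
            status[key] = "red"
        elif rate < degraded_below:
            status[key] = "yellow"
        else:
            status[key] = "green"
    return status
```

Feeding it a batch of `ProbeResult` values from external probes gives a per-region, per-service color with no manual approval step. The probing itself (HTTP checks, API calls, and so on) is left out here on purpose.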
I tried to monitor service status using <a href="https://stop.lying.cloud" rel="nofollow">https://stop.lying.cloud</a>, but it is also hosted on AWS, and it's down too.
I wonder if AWS will make more or less money from these outages?<p>Will large players flee because of excessive instability? Or will smaller players go from single-AZ to more expensive multi-AZ?<p>My guess is that no-one will leave and lots of single-AZ tenants who should be multi-AZ will use this as the impetus to do it.<p>Honestly, having events like this is probably good for the overall resilience of distributed systems. It's like an immune system: you don't usually fail in the same way repeatedly.
This outage is extremely frustrating to me. My company hosts all our apps in GovCloud. GovCloud West 1 is also down, but the AWS GovCloud status page indicates that everything is healthy and green. I thought AWS's response to the East outage last week was that they'd update the status page to better reflect reality.<p>GovCloud Status Page: <a href="https://status.aws.amazon.com/govcloud" rel="nofollow">https://status.aws.amazon.com/govcloud</a>
It's not just AWS - check the down reports: <a href="https://downdetector.com/" rel="nofollow">https://downdetector.com/</a><p>Cloudflare is having some significant issues as well on certain domains.
Is it AWS or could it be an ISP?<p>AWS seems to be working for me, but I’ve worked with clients in the US, and Spectrum internet tended to drop connections to us sporadically, which looks like an outage to our clients but is something we obviously can’t control.
I'm so glad that I'm not still the CTO of a startup. I would be getting dozens of e-mails from people without engineering backgrounds asking "Are we multi-cloud?" and "Why didn't you make us multi-cloud?"
There was a brief period of time back in the early 90's where I felt I understood how Linux worked -- the kernel, startup scripts, drivers, processors, boot tools, etc... I could actually work on all levels of the system to some degree. Those days are long gone. I am far removed from many details of the systems I use today. I used to do a lot of assembly programming on multiple systems. Today I am not sure how most of these systems work in much detail.
That was fun. Badges weren't working (daily checkin required) so the front desk had to manually activate them.<p>Slack wasn't sending messages and Pagerduty was throwing 500's.
Yep, it's broken again. I was trying to install some Thunderbird extensions, and stuff started breaking halfway through. Never thought of an AWS outage borking my mail client I guess...
We're having issues connecting to our EC2 bastions and accessing the us-west-1 dashboard too<p>EDIT: Cognito auth seems down for us too<p>EDIT2: our ALBs are timing out as well<p>EDIT3: us-west-1 looks like it's working now!
How much do you guys think these frequent outages will affect their market share in cloud products?<p>Is this enough of a push for organizations to actually move their infrastructure over to other providers?
Reminder that the internet was literally invented to avoid this kind of nuclear attack. But I guess people are herdish animals and prefer to die as a group.
We're having troubles in us-west-2.<p>Discourse is reporting trouble, too. <a href="https://twitter.com/DiscourseStatus/status/1471140369899290628" rel="nofollow">https://twitter.com/DiscourseStatus/status/14711403698992906...</a>
AWS status page shows an update:<p>> AWS Internet Connectivity (Oregon): 7:42 AM PST We are investigating Internet connectivity issues to the US-WEST-2 Region.<p>Source: <a href="https://status.aws.amazon.com" rel="nofollow">https://status.aws.amazon.com</a>
It is surprising that their status page is down too:<p><a href="https://status.aws.amazon.com" rel="nofollow">https://status.aws.amazon.com</a><p>Their CDN, CloudFront, always works reliably for me. Couldn't they put the status page on CloudFront?
I'm seeing outages in us-west-2 too. Customer-facing traffic served through Route53 -> ALB -> EC2 is down, and CLI tools are failing to connect to AWS too.
Vercel is down too.<p>My sites run on Cloudflare and Vercel, and I can't even log in to those right now.<p>I'm curious — what does Hacker News run on? It seems impervious to any kind of downtime...
Wow, yeah, us-west-1 AND us-west-2 are reporting connectivity issues. I'm guessing this is related to the Auth0 outage that's currently going on too.
Tangentially related: On Friday Backblaze and B2 were down for 10+ hours to update their systems for the log4j2 vulnerability. Seemed noteworthy for the HN crowd and I posted a link to their announcement when the outage began. However, the post was quickly flagged and disappeared. Genuinely curious, why is announcing some outages ok and others not?
An honest question. Why do you guys use AWS instead of dedicated servers? It's terribly expensive in comparison, nowadays equally complex, scalability is not magic and you need proper configuration either way, plus the outages are becoming more and more common. Frankly, I see no reason.
Root logins are suffering some kind of "captcha outage." The buzz has just begun <a href="https://twitter.com/search?q=aws%20captcha&src=typed_query" rel="nofollow">https://twitter.com/search?q=aws%20captcha&src=typed_query</a>
looks specific to certain (possibly AWS hosted or partially dependent) services such as Auth0:<p><a href="https://status.auth0.com/" rel="nofollow">https://status.auth0.com/</a><p>e.g. our services running on AWS are fine right now, but new sessions dependent on Auth0 are not.
My personal health dashboard on AWS shows "InternetConnectivity operational issue us-west-2"<p>[07:42 AM PST] We are investigating Internet connectivity issues to the US-WEST-2 Region.
Asking as a non-cloud-developer: why would Crunchyroll's recovery [0] lag so much behind AWS's recovery [1]?<p>[0] <a href="https://downdetector.com/status/crunchyroll/" rel="nofollow">https://downdetector.com/status/crunchyroll/</a><p>[1] <a href="https://downdetector.com/status/aws-amazon-web-services/" rel="nofollow">https://downdetector.com/status/aws-amazon-web-services/</a>
It appears AWS Status Page is hosted at AWS [0].<p>Seems like a really bad idea.<p>[0] <a href="https://hostingchecker.com/" rel="nofollow">https://hostingchecker.com/</a>
I'm on us-east-1 and everything is fine for me including:<p>* EC2 instances<p>* AWS Workspaces<p>* FSx for Windows<p>* AWS Directory Service<p>* S3 Buckets
Even as a software engineer, I think I could build from primitive materials a couple of battery operated transceivers to replace the signal flags or horsemen for critical communications. A little basic physics and materials science goes a long way.
Kentik data on the outage: <a href="https://twitter.com/DougMadory/status/1471162450649223173" rel="nofollow">https://twitter.com/DougMadory/status/1471162450649223173</a>
"Hey boss, that thing that took down us-east-1... that can't take down us-west-1 next week, can it?"<p>"No, no, of course not"<p>"Should I check?"<p>"No, don't waste time checking, get back to your TPS reports"
Yes, seeing it too.<p>Seems to be down in a major way. Lots of AWS services are down. However, so many things depend on AWS that it could just be that EC2 is down and it is causing a ripple effect.
ListenNotes.com has servers running on us-west-2.<p>One issue is that outbound requests from our servers in us-west-2 time out. Other than that, it seems that we are running ok so far.
Is that related to the current NPM status (<a href="https://status.npmjs.org/" rel="nofollow">https://status.npmjs.org/</a>)?
Systems Manager in eu-central-1 is giving us some issues now, but I am not sure about its internal architecture, so maybe it needs some US resources?
AWS Global Accelerator isn't working correctly anymore either; connections dropped worldwide. Seems like it is managed from us-west-2 and not redundant.
obligatory comment about status page showing seas of green: <a href="https://status.aws.amazon.com" rel="nofollow">https://status.aws.amazon.com</a>
HOST THE GODDAMN STATUS PAGE ON AZURE FOR FUCKS SAKE.<p>There is zero excuse for this shit. Be professional. Acknowledge reality. It is logically impossible to run your own status page. Trying to do so just wastes everyone else on the internet's time when you have an outage.
We are barbarians occupying a city built by an advanced civilization, marveling at the hot baths but knowing nothing about how their builders kept them running. One day, the baths will drain and anyone who remembers how to fill them up will have died.