It's a bad outage, it seems. 16+ hours now and counting, and they don't seem to have a root cause or a hot standby.

We just cut over all of our stuff (Ambassador API Gateway) to Docker Hub. Lots of the Kubernetes ecosystem is on Quay, so I wonder how this is affecting others. Our users are definitely affected, as well as our development team.
This literally cost me sleep last night: it paged me for new Kubernetes nodes that failed to transition to 'Ready' because they couldn't pull Calico images during bootstrapping. After some duct-taping to get those initial nodes up and running, we just moved the Quay-hosted images to a GCR repository and moved on with life.

But that doesn't diminish the fact that this outage is a complete disaster for Quay.
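For anyone in the same spot, the move is just a pull/tag/push cycle. Rough sketch below; the image name, tag, and GCR project are hypothetical examples, and the pull only works while Quay is reachable or the layers are already cached on the machine doing the mirroring:

    # Mirror a Quay-hosted image into our own GCR project (names and tag are examples)
    docker pull quay.io/calico/node:v3.8.2                                      # needs Quay up, or the image cached locally
    docker tag  quay.io/calico/node:v3.8.2 gcr.io/my-project/calico/node:v3.8.2
    docker push gcr.io/my-project/calico/node:v3.8.2                             # requires prior 'gcloud auth configure-docker'
    # ...then point the Calico manifests at gcr.io/my-project/... instead of quay.io/...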
For those on k8s: quick reminder, unless you're in development mode your image pull policy should be IfNotPresent, so you have at least some measure of caching to protect you from further degradation of service.

Beyond this, I'm updating things to use GCR so that if this bleeds into tomorrow, my team's development timeline isn't impacted any further.
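As a concrete sketch (the Deployment and container index are hypothetical), you can flip the policy on an existing workload with a JSON patch. Keep in mind that Kubernetes defaults the policy to Always when the tag is :latest or omitted, so pin a real tag as well:

    # Hypothetical Deployment; switch the pull policy so already-cached images keep working
    kubectl patch deployment my-app --type=json -p='[
      {"op": "add", "path": "/spec/template/spec/containers/0/imagePullPolicy", "value": "IfNotPresent"}
    ]'
    # Caveat: this only helps on nodes that already have the image; freshly bootstrapped nodes still need a reachable registry.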
Yup, we also had alerts for our k8s nodes not reaching Ready (Calico images). We moved to Docker Hub, but hit timeouts there a few times; I guess they got a sudden spike in traffic.

This will be bad for Red Hat's reputation.