Hi, I work at Docker. Here is my reply on the GitHub thread: <a href="https://github.com/docker/docker/issues/23203#issuecomment-223326996" rel="nofollow">https://github.com/docker/docker/issues/23203#issuecomment-2...</a><p>I am copying it below:<p><<<
Hi everyone. I work at Docker.<p>First, my apologies for the outage. I consider our package infrastructure critical infrastructure, both for the free and commercial versions of Docker. It's true that we offer better support for the commercial version (it's one of its features), but that should not apply to fundamental things like being able to download your packages.<p>The team is working on the issue and will continue to give updates here. We are taking this seriously.<p>Some of you pointed out that the response time and use of communication channels seem inadequate; for example, the @dockerstatus bot did not mention the issue when it was detected. I share that opinion, but I don't know the full story yet; the post-mortem will tell us for sure what went wrong. At the moment the team is focusing on fixing the issue and I don't want to distract them from that.<p>Once the post-mortem identifies what went wrong, we will take appropriate corrective action. I suspect part of it will be better coordination between core engineers and infrastructure engineers (two distinct groups within Docker).<p>Thanks and sorry again for the inconvenience.
>>>
From the GitHub issue thread, I see a lot of people angry that their production deployments are failing. If your production deployments point directly at an external repo, you had better not be surprised when it goes down. Because shit always happens.<p>If you want your deployments to be independent of the outside world, design them that way!
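"Designing them that way" can be as simple as never naming an external host in your apt sources. A minimal sketch (the host `mirror.internal` and the path are hypothetical; point apt only at infrastructure you control and have that mirror sync from upstream):

```
# /etc/apt/sources.list.d/internal.list
# Only hosts we run ourselves -- the mirror syncs from Docker's repo
# on our schedule, so an upstream outage can't break a deploy.
deb [arch=amd64] http://mirror.internal/docker ubuntu-xenial main
```

With this in place, an upstream outage only delays how fresh your mirror is; it never blocks `apt-get install` during a rollout.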
>Does this mean that Docker -- a major infrastructure company -- does not have any on-call engineers available to fix this?<p>It appears to be that way. Reminds me of when all of the reddit admins were stuck on a plane on the way back from a wedding [1].<p>Remember kids, improve your bus factor.<p><a href="http://highscalability.com/blog/2013/8/26/reddit-lessons-learned-from-mistakes-made-scaling-to-1-billi.html" rel="nofollow">http://highscalability.com/blog/2013/8/26/reddit-lessons-lea...</a>
It's scary how most people in that thread seem more concerned with forcing the installation through than with pausing to consider why the hashes might be wrong, and why it might not be a good idea to install debs with incorrect hashes.<p>If the apt repo were compromised (but the signing keys were not), this is very likely exactly the symptom that would appear.
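For context on what apt is actually checking: each package's SHA256 is listed in the (GPG-signed) Packages index, and apt refuses to install when the recomputed hash disagrees. A minimal sketch of that comparison, using an empty stand-in file (its digest is the well-known SHA256 of empty input; the "expected" value would normally come from the repo's index):

```shell
# Recompute the SHA256 of a downloaded .deb and compare it to the
# value the repository's Packages index advertised for it.
expected="e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

printf '' > /tmp/empty.deb                      # stand-in for the download
actual=$(sha256sum /tmp/empty.deb | awk '{print $1}')

if [ "$actual" = "$expected" ]; then
  echo "hash OK"
else
  echo "HASH MISMATCH: do not install"          # this is what apt saw
fi
```

When the two values differ, as they did during this outage, the only safe behavior is exactly what apt did: stop.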
This is a really bad title. There is nothing wrong with either Ubuntu's or Debian's repositories. The problem is with Docker's repositories of Ubuntu/Debian packages.
I'm a bit disappointed that people are willing to publicly criticize Docker when it's their own builds that are failing. They made the decision to depend on a resource that could be unavailable for a large number of reasons entirely unrelated to Docker or its infrastructure.<p>Just like the Node builds that failed, this should cause you to rethink how you mirror or cache remote resources, not prompt you to complain about your broken builds on a GitHub issue page. There may be things you'll never be able to fully mirror or cache (or for which doing so is entirely impractical), but an apt repository is definitely not one of them.
... which is why the clever sysop mirrors their packages and tests whether an update goes OK before updating the mirror.<p>If you're running more than three machines or regularly (re)deploy VMs, it is a sign of civilization to use your own mirror instead of putting your load on (often) donated resources.<p>It's the same stupid attitude of "hey, let's outsource dependency hosting" that led to the left-pad npm disaster and will lead to countless more such disasters in the future.<p>People, mirror your dependencies locally, archive their old versions, and always test what happens if the outside Internet breaks down. If your software fails to build when the NOC's uplink goes down, you've screwed up.
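The mirror-then-test workflow described above can start from something as small as an apt-mirror config; a sketch (the local path is an assumption, the repo URL is Docker's apt repo from the thread):

```
# /etc/apt/mirror.list -- minimal apt-mirror configuration.
# Pull Docker's repo into local storage; machines install from here,
# and the mirror is only promoted after a test upgrade succeeds.
set base_path    /var/spool/apt-mirror
set nthreads     20
deb https://apt.dockerproject.org/repo ubuntu-xenial main
clean https://apt.dockerproject.org/repo
```

Running `apt-mirror` on a cron schedule, then smoke-testing one staging host against the fresh copy before flipping production to it, gives you both outage isolation and a rollback point.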
I often wonder why the community's response to issues with an open/free/community package is to give the maintainers a strong argument to discontinue it in favour of a commercial one, or just abandon it altogether.
I think this is a chain of failures that depend on each other, especially when you use Travis CI: 1) apt-get is not flexible enough to ignore that error on apt-get update; 2) Travis CI has so much external stuff installed that its big image has more failure points; 3) the Docker repo failed.
Outages or misconfigurations can happen to pretty much any source of packages you use, be it Debian, PyPI, npm, Bower, or Maven repositories, or source control. Anybody remember left-pad?<p>So as soon as you depend heavily on external sources, you should start thinking about maintaining your own mirror. Software like Pulp and Nexus is pretty versatile and gives you a good amount of control over your upstream sources.
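Even short of a full Pulp/Nexus mirror, a caching proxy buys resilience against brief outages: already-fetched packages keep installing from the cache. A sketch with apt-cacher-ng interposed via one apt config line (the hostname is an assumption; 3142 is apt-cacher-ng's default port):

```
# /etc/apt/apt.conf.d/01proxy
# Route all apt traffic through a local caching proxy so that
# packages seen once stay installable during an upstream outage.
Acquire::http::Proxy "http://cache.internal:3142";
```

It's not a substitute for a tested mirror, but it is a one-line hedge you can deploy today.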