Nice reminder of a "glitch" that happened in one of the datacenters used at the airline where I used to work.<p>Went like this: Guy who shows around the new datacenter/ops guy demonstrates how the emergency power off works by lifting the protection plate. Protection plate unhinges suddenly and drops onto the emergency power off button. Hilarity ensues.
I have a nagging (intuition? feeling?) that software safety/reliability/security needs are going to explode soon (because unreliabilities multiply in non-resilient systems interacting with each other) and that these are simply foreshocks.<p>(yeah, I know security is already a huge deal, but as we come to trust software systems more and more, the safety/reliability factor will come more into play)<p>EDIT: This is also part of the reason I've been learning Elixir (<a href="http://elixir-lang.org/" rel="nofollow">http://elixir-lang.org/</a>) since it's based on the highly-resilient Erlang and is designed to embrace failure. This was also informed by me reading Nassim Taleb's book "Antifragile" as well as "Thinking in Systems: A Primer" by the (late) Donella Meadows.
I worked on United's computer systems for a year (never that one though), and so I get nervous when I see a headline like that. True story: one of their systems still runs on a mainframe that has 9 bits in a byte!
I knew someone at United who once offered to give me a tour of one of their data centers: "It is like a computer museum - we have one of everything." Hard to imagine that they would have problems as a result. United is a really, really bad airline.
I bet this is an issue with an old mainframe used somewhere in the booking system, something that has worked well but is difficult to fix when things go wrong.<p>I think there is / will be a lot of money to be made trying to solve the problem of software security and reliability. This is obviously an extremely difficult problem; however, given the number of ancient systems that we currently have interconnected, I think more large-scale outages like this are inevitable.
Ouch. And only a month after another major systems outage: <a href="http://www.wired.com/2015/06/united-flights-grounded-mysterious-problem/" rel="nofollow">http://www.wired.com/2015/06/united-flights-grounded-mysteri...</a>
Was reading this on HN and heard it on NPR simultaneously.<p>I have a sneaking suspicion that booking systems for most airlines run atop legacyware. It just seems like the type of thing that would've been put in place long ago and then be very expensive to migrate/upgrade.
Not a word about it on United's web site. Flight status page doesn't load correctly. "Today's Operations" gives an error message. United's Twitter is silent.<p>Meanwhile news articles and Twitter complaints abound.
<a href="http://mashable.com/2015/07/08/united-computer-problems-flights-grounded" rel="nofollow">http://mashable.com/2015/07/08/united-computer-problems-flig...</a>
United is a mess. I had the misfortune of flying with them a couple of months ago. I ended up in a city a two-hour drive from my actual destination and had to rent a car on my own to get to where I was going.<p>That was the worst of it, but almost every flight I saw on the way (both ones I was on and other flights at nearby gates) was delayed or overbooked or otherwise messed up in some way.
Very little news about this on Google News, but heard over the local Chicago ABC affiliate that the FAA attributed this to an "automation error".<p>Edit: And its Twitter account has been relatively inactive, with more than 30 minutes since the last reply-to or general tweet...presumably a lot of complaining tweets have come in in the last 30 minutes <a href="https://twitter.com/united/with_replies" rel="nofollow">https://twitter.com/united/with_replies</a>
Interesting bits:<p>"Departing DEN; taxied and then returned to gate. Pilot says nationwide failure of "three or four" computer systems. Only information from airport staff is that since the computers are down UAL can't book pax onto any other airline ..."<p>"Systemwide Ground Stop posted at FAA:
Due to USER REQUEST DUE TO AUTOMATION ISSUES. UAL AND SUBS ONLY., departure traffic destined to ALL airport will not be allowed to depart until at or after 13:15 UTC."
Considering just yesterday their flight system didn't believe they flew from SFO for 2+ hours in the morning (I have screenshots), I'm not all that shocked.
Off-topic: Please use ISO 8601 format (YYYY-MM-DD) for dates in titles. The US date format hurts my poor logical soul.<p><a href="https://en.wikipedia.org/wiki/ISO_8601" rel="nofollow">https://en.wikipedia.org/wiki/ISO_8601</a>
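For anyone wondering what the ISO 8601 suggestion buys you in practice, here is a minimal Python sketch using only the standard library. The date is taken from the Mashable URL elsewhere in the thread (2015/07/08); the variable names are just illustrative, not from any particular codebase.

```python
from datetime import date

# Date taken from the Mashable link in this thread (2015/07/08).
outage = date(2015, 7, 8)

# US-style month/day/year, for comparison.
us_style = outage.strftime("%m/%d/%Y")

# ISO 8601 calendar date: YYYY-MM-DD.
iso_style = outage.isoformat()

# An ISO 8601 string parses back to the same date without ambiguity.
round_trip = date.fromisoformat(iso_style)

# ISO 8601 strings also sort chronologically as plain text.
# These extra dates are arbitrary, chosen only to show the ordering.
sorted_dates = sorted(["2015-07-08", "2014-12-31", "2015-01-15"])
```

The design point is that the most significant field comes first, so plain string sorting matches chronological order and there is no month/day confusion.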
That's twice in a short time now - 'coincidence' on Ian Fleming's scale. A third time is enemy action. Airlines do seem like a pretty juicy target for cyber war operations - you can cause a gigantic amount of disruption with a successful attack on a single system.