How Southwest Airlines melted down

283 points by wallflower over 2 years ago

41 comments

nthitz over 2 years ago
https://archive.ph/J3pFF
burlesona over 2 years ago
It’s fascinating that the same hopscotch travel pattern that allows SWA to offer better service to more places is also what caused the network to suffer cascading failure. Once a critical mass of pieces (planes/crew) were out of position the whole network fell apart, and it’s large enough that it seems like neither the humans nor software can easily reason about how to resume operations. Hence the need for a “full system reboot” over many days.

Anecdotally, I flew Southwest just before Christmas. The network was already buckling and we had major delays, but we were lucky and made it through. Despite the stress, the SWA crews were helpful, empathetic, and polite. They handled it better than I would have if I had been in their shoes.
icambron over 2 years ago
I've told this story a few times, but maybe 10 years ago I had a cross-country JetBlue flight that was delayed perhaps 6 hours. It was a few days after a major storm. Like Southwest here, JetBlue didn't have much flex capacity and relied on the daisy chain to keep on chaining. Our plane had gotten stuck somewhere, so they had to find a different one at some far-away airport and fly it in, which took hours. But the kicker was that when the plane finally landed, the crew already onboard couldn't man the flight because that would exceed their duty limits. The airline didn't realize this ahead of time, so they had to gather a new crew (like literally call them in), which added a couple of hours to the delay.

Naively, I'd assumed these kinds of things were handled in some sort of mission-control center with warnings from rule engines blinking on some big screen and a team of crack operators mapping out what needed done. But clearly that wasn't so: they were just making things up as they went along. Sounds like Southwest is in a similar spot, but this time on a much bigger scale.
nostromo over 2 years ago
The actual answer is buried at the end of a long article.

> Unlike many rival airlines, Southwest’s planes generally hop from one city to another, rather than orbiting a major hub. That approach lets Southwest maximize use of its planes and crew, but the daisy chain structure also makes its network more delicate—problems in one corner of the country can be difficult to contain
ComputerGuru over 2 years ago
Southwest is statistically the worst airline in terms of delays and cancellations but has deluded its customers into thinking it's the best (according to surveys asking people to rate airlines on their reliability).

https://www.insidehook.com/daily_brief/travel/airlines-fewest-delays-cancellations
marze over 2 years ago
I find it especially ironic that SWA's system failed them, and that this large failure was preceded by worse and worse "near failures", since SWA is in the aviation business.

In the aviation arena, high reliability is maintained in part by careful analysis of "near failures": lessons are extracted and improvements are made to aircraft designs, procedures, etc.

By contrast, the "near failures" of the SWA system as a whole don't appear to have been utilized to motivate system improvements.
igetspam over 2 years ago
A friend of mine wrote on this topic today as well.

https://www.seat31b.com/2022/12/the-great-southwest-meltdown-of-2022/
thepasswordis over 2 years ago
I'm surprised they haven't tried to blame a cyberattack yet.

That said, I feel like these sorts of catastrophic ultra-fragile McKinsey-consulted-to-death failures we keep seeing in various industries are basically a giant signal to any adversaries that says "Hi! Check out how easy it would be to grind this entire industry to a halt!"

Resiliency is literally the *opposite* of efficiency. These systems need to have slack, aka *inefficiency*, built into them. Unfortunately the business culture has moved towards ultra fragile, ultra efficient thinking.
twobitshifter over 2 years ago
https://blog.geaerospace.com/technology/big-wins-in-flight-efficiency-analytics-2/

Skysolver is a GE Flight Services trademark - there’s a video here showing how it works and SW planes. Contrary to the Reddit claim, it does appear to use a predictive algorithm.

Highlight quote from the video:

“It is humanly impossible when there’s a major disruption for somebody to figure out what the optimal approach is to get them back on schedule”
nlstitch over 2 years ago
I would be very interested in a post mortem of the software used, called SkySolver. It's supposed to be a Java application which is said to be developed by Accenture? Anyone have actual technical insights into why it failed?
ProAm over 2 years ago
Still my go-to airline because the rest are so difficult, unfriendly or just greedy. I'd rather deal with Southwest every time to feel like I am a human being.
crosen99 over 2 years ago
It's easy to ask, "How could this happen?", but it's also a wonder this sort of thing doesn't happen more often with airlines and other businesses that rely on solutions to complex logistical challenges at their core. Overall, despite the horrors of war, perils of a pandemic, etc., sometimes I pause and ponder how remarkably well the world works.
calbear81 over 2 years ago
I’ve been lucky to have caught a flight back to SF after cancellations and can wait at home while figuring out how to get to my original destination.

What I don’t understand is how come SW couldn’t enlist help to get customers rebooked on other airlines - their phone lines were slammed (I waited 3 hours) just to get a refund since their app wouldn’t allow me to choose to rebook/cancel.

If I was as customer focused as they say they are - I would’ve contacted AMEX global travel and gotten their entire network of booking agents to backfill and rebook customers on other flights.
jrochkind1 over 2 years ago
From what we know, this to me sounds like a story about technical debt.

"Sure, it's held together with rubber bands and is a mess, but it would cost hundreds of millions to fix, and it's working, isn't it? So the programmers complain a bit, that's their job."

Which works until conditions change in some way and it catastrophically does not.

I think a lot of our society is now run on unreliable fragile software. I expect to see a lot more of this. "Automation" is especially cost-saving when you don't mind it being a fragile unreliable time-bomb.
w10-1 over 2 years ago
So much chatter!

I would expect any interview candidate to spot the issue within a minute.

For hub systems, ready crews are either at the hub, or at a spoke, ready to come back. That gives the hub a queue of ready crews, and each spoke can return a crew-plane combination to the hub when available. So with natural queues, there's no delay cascade: it's all a function of weather and crew/plane readiness.

For point-to-point systems, crew-plane combinations are scattered, and the next flight opportunity might not be the next flight need. There is no buffer anywhere. Furthermore, any greedy/opportunistic strategy at one point can block a superior global solution.

That's the point-to-point trade-off taken by SWA. In the common case of good weather, you avoid the extra miles from going via hubs. But in the rare case of global weather shutdowns, there is no good recovery.

The only real question is whether SWA had any obligation to communicate this to investors and passengers. So far, Apple stock has gone down more than Southwest's in this period, and passengers are remaining loyal, so no damage done.
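
A toy sketch of the buffer argument in the comment above, not a model of any real airline. It assumes a delay rides along with one plane and crew through consecutive legs, that each turnaround recovers a fixed amount of schedule slack, and that a hub can swap in a waiting crew and plane to reset the delay. The function name `propagate` and all numbers are hypothetical.

```python
def propagate(initial_delay, legs, slack_per_turn, spare_crews=0):
    """Return the delay (in minutes) seen on each consecutive leg.

    Without spares, the same late plane and crew fly every leg and only
    recover `slack_per_turn` minutes at each turnaround (point-to-point).
    With spares, a late arrival is replaced by a ready crew and plane
    waiting in the queue, which resets the delay (hub).
    """
    delays = []
    delay = initial_delay
    for _ in range(legs):
        delays.append(delay)
        if delay > 0 and spare_crews > 0:
            spare_crews -= 1  # consume one buffered crew/plane
            delay = 0
        else:
            delay = max(0, delay - slack_per_turn)
    return delays


# Point-to-point: a 2-hour delay lingers across the whole chain.
print(propagate(120, legs=4, slack_per_turn=10))                 # [120, 110, 100, 90]
# Hub with one ready crew in the queue: the delay stops after one leg.
print(propagate(120, legs=4, slack_per_turn=10, spare_crews=1))  # [120, 0, 0, 0]
```

The sketch ignores the harder part of the comment, namely that a greedy local choice can block a better global solution.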
francisofascii over 2 years ago
So they are blaming SkySolver software. The article says it is off-the-shelf software? But in other news reports, they make it sound like it was developed in-house.
crisdux over 2 years ago
I don't buy the narrative that inadequate technology is the main reason for the Southwest debacle. We must ask, why did this happen now and not before? Southwest has previously been able to better deal with disruptions like this. While the weather event did happen in the middle of their network, it wasn't unprecedented.

I think a more obvious reason is staffing issues brought on by covid, layoffs, and the vaccine mandates. They lost experienced employees who were able to wrangle the bad scheduling software. Throughout 2022, Southwest was having hiring issues because they were still mandating the vaccine through at least the summer for new employees. Their pilots association warned about this causing disruptions after a bunch of summer cancellations. Do people forget how flaky Southwest was during summer 2022? Southwest just recently reached staffing levels that matched their 2019 high. This "inadequate technology" narrative just seems like a convenient scapegoat.
paulpauper over 2 years ago
For a meltdown, the stock is back to where it was in October, tracking other airlines and the overall market, which keeps falling. I think people have become so accustomed to this sort of stuff that it does not affect business long term. After Covid, people are accustomed to major inconvenience when traveling.
mise_en_place over 2 years ago
You only get bitten in the ass by tech debt after it’s too late. I’m sure management justified not paying it down because, truthfully, the consequences are never really felt until it’s too late. It’s better to pay down tech debt incrementally, instead of grand projects promising full rewrites.
zx8080 over 2 years ago
Aren't cases like this where the automated solvers are expected to shine?

If, on the other hand, it's not at all about software failures as many comments here suggest ("company management lost track of crews" notion), then does it have something to do with software at all?
christkv over 2 years ago
I thought this stuff also gets more likely to happen as you get towards the end of the month, as you are close to the max number of hours the pilots can fly per month (100h in the US per calendar month), making pilot shortage cascades even more likely to happen.
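
A minimal sketch of the end-of-month effect described above. It takes the 100-hour monthly figure from the comment as an assumption rather than as a statement of the actual regulation, and the pilot names and hours are made up.

```python
MONTHLY_LIMIT_HOURS = 100.0  # figure quoted in the comment; treated as an assumption


def can_fly(hours_flown_this_month, flight_hours, limit=MONTHLY_LIMIT_HOURS):
    """True if the flight would not push the pilot past the monthly cap."""
    return hours_flown_this_month + flight_hours <= limit


def available_pilots(pilots, flight_hours):
    """Names of pilots who can still take a flight of the given length."""
    return [name for name, flown in pilots.items() if can_fly(flown, flight_hours)]


# Hypothetical roster late in the month: most pilots are near the cap.
roster = {"alice": 70.0, "bob": 97.5, "carol": 99.0}
print(available_pilots(roster, flight_hours=2.5))  # ['alice', 'bob']
print(available_pilots(roster, flight_hours=4.0))  # ['alice']
```

The closer the roster sits to the cap, the fewer substitutes exist for any disrupted flight, which is the cascade risk the comment points at.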
lowbloodsugar over 2 years ago
They rely on *telephones*???
pledess over 2 years ago
Apparently one other factor is that long on-hold phone calls (by Southwest staff) count against FAA work time limits, and thus can prevent that person from serving on a flight - according to the former CCO of JetBlue (St. George) at https://twitter.com/martysg/status/1608161473083183106
GeoffKnauth over 2 years ago
I wonder if there's an accurate record (time series) of what got canceled when, why, in what order, along with an accurate starting state for the whole SWA system. Then, for Monday morning quarterbacking, modelers and scientists could step through the meltdown in slow motion, and see, at each step, what they think could have been done differently.
Spivak over 2 years ago
None of this actually explains what happened. Okay they have an off-the-shelf product called SkySolver that they use to manage their flights and it’s old (I guess the traveling salesman problem has really changed since 1930) and it couldn’t handle the sudden change in resource availability and flight constraints.

But… why? What actually happened?
alkonaut over 2 years ago
So if I understand this correctly, almost the only thing that would have been needed would be a system of self-reporting from crews where they and their plane are, in case they are delayed or diverted?

The "manual entry" system is already there, so not much seems to be needed other than adding this seemingly simple front end. You don't muck around with old business critical software, you add to it whenever possible. And this seems to be exactly like that. The existing manual entry system feels like it should be able to accept self-reports from a mobile app or web site with minimal changes to the existing system.
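
A rough sketch of what such a self-report intake could look like. It assumes a simple in-memory table of last-known positions keyed by crew ID, and says nothing about how Southwest's actual manual entry system works. `PositionReport` and `apply_report` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PositionReport:
    crew_id: str
    airport: str       # IATA code where the crew says they are
    reported_at: int   # seconds since epoch, as sent by the app


def apply_report(last_known, report):
    """Record a self-report unless a newer one is already on file.

    Returns True if the table was updated, False if the report was stale
    (for example, a delayed message arriving out of order).
    """
    current = last_known.get(report.crew_id)
    if current is None or report.reported_at > current.reported_at:
        last_known[report.crew_id] = report
        return True
    return False


positions = {}
apply_report(positions, PositionReport("crew-17", "DEN", reported_at=1_000))
apply_report(positions, PositionReport("crew-17", "MDW", reported_at=900))  # stale, ignored
print(positions["crew-17"].airport)  # DEN
```

Keeping the intake as a thin layer that only writes "who is where, as of when" matches the comment's point about adding to the old system rather than modifying it.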
ChoGGi over 2 years ago
"Mr. Jordan said he believed if Southwest could power through a few tough days, things would improve—its usual playbook."

I feel for the employees
dehrmann over 2 years ago
Anyone remember the A&E show Airline from 20 years ago? This would have made quite the episode.
dpacmittal over 2 years ago
This is probably the biggest failure caused by software going into a bad state.
tonymet over 2 years ago
None of these accounts review the human factors. Experienced administrators, pilots and crew were terminated over the past year and the replacement staff is not as experienced. Not everything is an algorithm error.
TexasDawg over 2 years ago
As someone with over 20 years experience developing decision support models for domestic and international airlines, here's my take.

Southwest Airlines is a domestic carrier flying a point to point schedule, using the same fleet type to eliminate the need to train pilots on multiple aircraft types and also reduce training costs for transitional training from a narrow to a wide-body jet. All pilots can (theoretically) fly all planes in their fleet, tremendous training and labor savings. When flight crews "bid" on their flights, having only one aircraft type also reduces the complexities of bid lines down to basic seniority. (Let's ignore over water to Hawaii, m'kay?)

So, getting SWA back in the air should be much simpler compared to every carrier who got their planes and crews back flying the same or next day - get it? SWA has a single fleet and all of their crew is qualified to operate those jets or crew the cabin - so "wat da problem is"??? US Gov, inquiring minds want to know!!!

At all airlines, crews and planes are scheduled, and optimized, using decision support models which take into account how many hours the crew (pilots and flight attendants) can fly by law and contract; how long each tail# can fly until it needs to be flown into a maintenance base for an A/B/C/D check for scheduled maintenance, and how to get the maximum airtime out of the asset per day - in perfect weather conditions.

There are also decision support systems that monitor pricing of competitors every second of the day and are why your airfares change from one browser refresh to another, yield management models which run overnight taking input from industry load data from the past year in the market SWA flies to help predict the passenger "load" which sets fares and also permits manual inputs for special events such as a World Series, or Super Bowl etc which would spike demand and drive airfares higher.

And there are Air Operations systems which are similar to the named product which take into account weather and crew events and help an airline re-plan based on where crew is currently. These Ops systems should have interfaces built to the crew (pilot and flight attendant) systems to know where they are located as well as where the jets are. Those values, along with the number of hours the crew has worked, would be used to re-calculate the crew and fleet assignments with the associated fleet and crew scheduling decision support models. These DSS ran on either big iron multiple CPU Unix servers or multiple CPU Linux servers - point being the computational power was outstanding and yes, CPLEX was typically a library utilized by our PhDs. There was a lot of money spent on the hardware and the people who developed these models - it wasn't cheap but then none of the clients ever experienced this type of problem with scheduling and yes, the clients are named in the weather impact article alongside SWA, but they are up and running either same or next day...

My world had a common database and data model where all data was integrated from various systems regardless of if it was our system or not because a decision support system without current and accurate data is like having corn cobs for toilet paper vs. Charmin... painful and not a lot of value. This is the one time I'm looking forward to the gov looking under the hood of private industry and revealing where the technical and management issues are. You can't blame the technical teams as they're only paid what SWA pays to hire both FTEs and contractors, and why I've never had a phone call that lasted beyond finding out what the position paid. We all know, you get what you pay for, but again, my opinion, and I'm greedy!!!
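
To make the re-planning inputs named above concrete (where each crew is, and how many hours they have left), here is a deliberately simple greedy sketch. It is not how SkySolver or any CPLEX-based model works; real systems solve an optimization problem over the whole network. The `Crew` and `Flight` types, the flight numbers, and the tie-breaking rule are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Crew:
    name: str
    airport: str       # current location
    hours_left: float  # remaining legal duty hours


@dataclass
class Flight:
    number: str
    origin: str
    duration: float    # block hours


def reassign(flights, crews):
    """Greedy re-plan: for each flight in order, pick an unused crew that is
    already at the origin and has enough hours, preferring the crew with the
    least hours left so that longer-range crews stay available."""
    assignments = {}
    used = set()
    for flight in flights:
        candidates = [
            c for c in crews
            if c.name not in used
            and c.airport == flight.origin
            and c.hours_left >= flight.duration
        ]
        if not candidates:
            assignments[flight.number] = None  # flight cannot be crewed
            continue
        best = min(candidates, key=lambda c: c.hours_left)
        used.add(best.name)
        assignments[flight.number] = best.name
    return assignments


crews = [Crew("A", "DEN", 2.0), Crew("B", "DEN", 6.0), Crew("C", "MDW", 5.0)]
flights = [Flight("WN1", "DEN", 1.5), Flight("WN2", "DEN", 4.0), Flight("WN3", "DAL", 1.0)]
print(reassign(flights, crews))  # {'WN1': 'A', 'WN2': 'B', 'WN3': None}
```

The uncrewed `WN3` shows the dependency on accurate data: crew C is legal and idle, but the plan is only as good as the location feed that says C is in MDW rather than DAL.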
snambi over 2 years ago
Too much centralization. Centralization is bound to fail eventually. It happened.
dougb over 2 years ago
A friend of a friend is a technical recruiter for SWA. They told me that SWA pays their Software Engineers below market salaries. They complained that because of that, it's hard to recruit people, and they end up with mediocre developers.
eternalban over 2 years ago
tldr: SkySolver O(n^2) meets a somewhat large n.
bmitc over 2 years ago
Doesn’t it feel that, since COVID, we keep hearing more and more about these colossal, systemic failures more frequently? It makes me concerned because it seems most assume that at least *someone* knows what’s going on. Whereas the reality is that *almost no one, if anyone*, actually knows what’s going on, and that we have fragile software systems “running” everything. And COVID seems to have been a real shot to the brow for many of these systems.

I just worry about what hell we’re creating. Software basically captures miscommunication and poor understanding and executes it.
bottlepalm over 2 years ago
Anyone else work at a large company with a massively complex system that basically controls everything, and a skeleton crew of devs that know how it all works?
candiddevmike over 2 years ago
Their microservice rewrite will be ready any day now
LetsGetTechnicl over 2 years ago
It's good to see that the billions of dollars in federal bailouts they received during the pandemic were put to good use and not given to executives with salaries already in the millions of dollars and stakeholder dividends... oh wait...
colechristensen over 2 years ago
There really should be laws or regulations that fine airlines for this kind of behavior and compensate passengers. Reliability failures that have that much impact on people need more consequences than just seeking another airline next time.
juujian over 2 years ago
Insane that the US has put all its eggs into one basket. One bad event and interregional passenger travel completely breaks down. Looks like it would be wise to invest in a more resilient mode of transportation.
coliveira over 2 years ago
Nobody seems to figure out the core problem for airlines: they're selling tickets that they cannot fulfill under current conditions. If they know they don't have enough pilots, enough attendants, and so on, why in the world are they selling all the tickets they've "planned"? Is this a realistic plan or is it a SCAM, i.e., taking people's money for a service that they know they won't be able to fulfill in the future? These companies need to be investigated!