Failed intercept at Dhahran caused by a software error in handling of timestamps

205 points by sine, about 7 years ago

19 comments

avar, about 7 years ago
Even better, the timeline:

 - February 11th: Vendor informed of the issue
 - February 25th: 28 people die because of the issue
 - February 26th: The vendor ships a fix

I'd have loved to be a fly on the wall for that phone call on the 25th (or early on the 26th).
tofof, about 7 years ago
This particular bug is often taught in university compsci classes because "bug that killed people" is a good attention grabber. The CS/EE analysis is sound; its truthfulness is only suspect because of the DoD's claimed successes.

A more truthful "computer bugs that killed people" example would be the Therac-25, a machine intended to treat cancer with tightly-focused radiation therapy. Six patients died as a result of massive overdoses of radiation, on the order of 20,000 rads. It was possible for the machine to end up in a state where it delivered full-power radiation without a hardware shield in place to protect the rest of the patient's body. No hardware interlocks were used to ensure that the full power mode was only usable with the shield in place; all safety features relied on software. In addition, the bug was only possible when an operator made a mistake in mode selection and then *rapidly* (proficiently) corrected it. The rapidity required prevented the bug from being discovered during slow, methodic, careful testing.

See Hackaday's article Killed by a Machine (and associated HN discussion) or, for the especially curious, a 49-page post-mortem for more detail:

https://hackaday.com/2015/10/26/killed-by-a-machine-the-therac-25/

https://news.ycombinator.com/item?id=12201147

http://sunnyday.mit.edu/papers/therac.pdf
otoburb, about 7 years ago
This was a tragic and preventable loss. It's incredible that a software bug might have been the root cause.

At the time, this incident really stuck out because it broke the illusion of our fabled Patriot missile shield protecting us. Civilian expats really *believed* the inflated Patriot interception rates parroted to us by mainstream media and our American military expat buddies.

A large number of remaining expats who had stuck out the Gulf War to that point decided to pack it in and leave when word got out that the Dhahran barracks were hit. Although history shows that Iraq surrendered days after this incident, at the time there was heightened fear and confusion amongst the remaining expats, especially the non-Americans.

We left on the last Lufthansa flight (crewed by military personnel) after hearing about this.

Nostalgic edit:

During the Gulf War embassies issued equipment and rations to expat citizens who chose to stay behind. Americans were issued full body suits (for adults and youths) due to the biological and chemical weapon payloads that Saddam boasted his SCUDs were carrying, along with MREs that tasted fabulous! In stark contrast, Commonwealth citizens were issued a bare gas mask (adult size only) and mono-flavour MREs that tasted like cardboard.

The British embassy sticks out in my mind: with stern stone-faced expressions they admonished us all for not evacuating and thus endangering children in a war zone. In addition to the terrible rations and gas masks, they wordlessly gave us a stack of translucent stickers. When asked what they were for, embassy staff explained that in the event of the air siren going off, we should get under our sturdiest tables and don our gas masks (standard procedure), and *then* slap the stickers on. If the stickers changed colour, it meant we were in the presence of a biochemical agent and would have approximately 10 seconds before we died a horrific death.

You kind of had to be there to appreciate the grim humour.
sharemywin, about 7 years ago
I remember hearing about this in my numerical analysis class.

1. I remember hearing the system was only designed for XX operational hours but was being run over the operational spec.

2. The time was stored in base 10, so the calculation errors added up over time or something like that. If they had used some base 2 timing scheme, it wouldn't have had issues with rounding errors.

My class was in the mid nineties, so the details of my 25-year-old memory are pretty hazy... at best.
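
The accumulating rounding error described above can be sketched in a few lines. This is an illustration under assumptions that are not stated in the thread: that the clock counted ticks of one tenth of a second, and that the constant 1/10 was chopped to 23 fractional binary bits, as commonly cited analyses of the incident describe. The names `FRACTION_BITS`, `chopped_tenth`, and `clock_drift_seconds` are hypothetical.

```python
from fractions import Fraction

# Assumption: number of fractional binary bits kept when 1/10 was chopped.
FRACTION_BITS = 23


def chopped_tenth(bits: int = FRACTION_BITS) -> Fraction:
    """Return 1/10 truncated (not rounded) to `bits` fractional binary bits."""
    scale = 2 ** bits
    return Fraction(scale // 10, scale)


def clock_drift_seconds(hours: float, bits: int = FRACTION_BITS) -> float:
    """Accumulated error after `hours` of counting tenth-of-a-second ticks."""
    ticks = int(hours * 3600 * 10)  # ten ticks per second
    per_tick_error = Fraction(1, 10) - chopped_tenth(bits)
    return float(ticks * per_tick_error)


if __name__ == "__main__":
    for hours in (8, 20, 100):
        print(f"{hours:>3} h uptime: drift of {clock_drift_seconds(hours):.4f} s")
```

Under these assumptions the drift after 100 hours comes out at roughly a third of a second. A tick length that is a power of two (for example 1/8 s) would be represented exactly in binary and would accumulate no error of this kind.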
OedipusRex, about 7 years ago
That was a temporary fix, then a software patch was released. I also wouldn't call that a "software" fix.
dredmorbius, about 7 years ago
The inimitable comp.risks discussed this in 1992:

http://catless.ncl.ac.uk/Risks/13/35#subj1.1

http://catless.ncl.ac.uk/Risks/13/76#subj8.1

And in 1997:

http://catless.ncl.ac.uk/Risks/18/79#subj9.1
tntn, about 7 years ago
Despite other comments below, I think that the equivalence drawn between "failed to save" and "killed" reflects an interesting philosophical choice. I don't think that this equivalence is universally accepted, even by those who call thinking otherwise fallacious.

If an EMT fails to save a victim of a car crash, did he/she kill the victim? If the dispatcher misspoke and gave the wrong cross street, delaying aid, did the dispatcher kill them?
logfromblammo, about 7 years ago
For doing a ballistic propagation, you apply a gravitational map in Earth-centered, Earth-fixed (ECEF) geodetic coordinates, then convert to Earth-centered rotating (ECR) geodetic coordinates, because that way you don't have to correct for the Coriolis effect. That ECEF-ECR conversion requires a time-of-day parameter.

You can use a gravitational map that only accounts for latitude, but it isn't as precise.

So using an accurate clock is *really* important if your intent is to hit a missile with a missile.
sjburt, about 7 years ago
This is a completely misleading headline. The Patriot missile was not effective at destroying the Scud [0]. The DoD initially claimed successful intercepts when the missile detonated near the Scud, but it rarely, if ever, actually destroyed the warhead. The only reason there was an illusion of success was that the Scud was also spectacularly unreliable and often broke up on re-entry or failed to detonate. It is a complete falsehood to claim that the Patriot would have prevented this loss of life.

[0] http://www.slate.com/articles/news_and_politics/war_stories/2003/03/patriot_games.html
seorphates, about 7 years ago
Reboot. Around the same time-frame we gathered the flag for a deployment (fleet admiral) and I was responsible for UNIX systems on the ship. Not long after coming aboard the command came down to reboot all of the systems at midnight, nightly (yes, only the UNIX systems). Being that "But Mister.." never really gets you too far in the military I just rode it, iterating through any possible reason for the madness, nightly. I could never come up with a good one. Until now. (OK, perhaps not a "good" reason but crazy enough to count.)

It now makes much more sense to me that a (terrible) mishap had occurred and possible prevention was only a reboot away. I can see how being exposed to that context at upper levels could easily cause one to latch onto any perceived preventative measures.

I also once saw a short ntp time step across multiple clusters (yeh, simultaneously) shut down half of a wafer factory.

Time is important.. but rebooting all your systems at midnight probably will not help you to control it. This especially if there are large, hot, fast objects flying around in the night sky and definitely, really, don't do ALL of them at the same time every day .. especially during, you know, battle. /pro-tip
bertjk, about 7 years ago
I've often wondered, considering the supposed low accuracy of Scud missiles (wiki gives it a CEP of 450m), how many of the casualties from that incident were due more to the bad luck of the missile actually hitting its target.
criley2, about 7 years ago
This is a bad, editorialized title that is not the title of the article.

Mods should change this. The "software fix" was a software patch which corrected the clocking bug.

The "software workaround" to use before the fix was a reboot.

I hate editorialized, lying titles :(
leggomylibro, about 7 years ago
I could be reading this wrong, but 1/3 of a second within 100 hours seems really good, like something you'd get from a temperature-controlled crystal oven.

I don't mean to second-guess them in an area I know so little about, but if that was enough to cause a serious issue in the span of only a few days, shouldn't the devices be designed with a separate synchronization system, at least as a backup? Maybe GPS?

Which brings up a sort of interesting question... would a Patriot missile system even have receivers for a weak public signal like GPS, or is it all self-contained?
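
For a sense of scale, the figures in this comment (one third of a second over 100 hours) can be turned into a relative error. This is only a back-of-the-envelope sketch; the function name `relative_error_ppm` is hypothetical. Note that the thread title attributes the error to how the software handled timestamps, not to the oscillator itself.

```python
def relative_error_ppm(error_seconds: float, elapsed_hours: float) -> float:
    """Clock error expressed in parts per million of the elapsed time."""
    elapsed_seconds = elapsed_hours * 3600.0
    return error_seconds / elapsed_seconds * 1e6


if __name__ == "__main__":
    ppm = relative_error_ppm(1 / 3, 100)
    print(f"{ppm:.2f} ppm")  # prints "0.93 ppm"
```

Roughly one part per million is indeed small as clock errors go, which is why the absolute offset only becomes a problem when it is multiplied by a very high target speed.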
brohoolio, about 7 years ago
This is depressing. One of my middle school classmates had a brother killed in a SCUD strike.
jimjimjim, about 7 years ago
Regarding the comments about bug killed people versus weapon killed people:

There is no one answer. This argument is a result of black-white/yes-no/us-them single-point-of-blame thinking, and it's terrible.

The bug *contributed* to the loss of life.
macawfish, about 7 years ago
Little things do add up.
mlazos, about 7 years ago
The title of this post is misleading: they eventually supplied a software patch that fixed the clock drift. The Israelis proposed rebooting as a stopgap until the bug could be fixed.
nathan_long, about 7 years ago
> The Patriot missile battery at Dhahran had been in operation for 100 hours, by which time the system's internal clock had drifted by one-third of a second. Due to the missile's speed this was equivalent to a miss distance of 600 meters.
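
The arithmetic behind the quoted figures is simply distance = speed × time error. The sketch below uses only the numbers in the quote (one third of a second, 600 meters, 100 hours) to back out the implied speed. It then assumes, purely for illustration, that the clock error grows linearly with uptime, to show why a reboot would shrink the miss distance. All function names are hypothetical.

```python
def implied_speed_m_per_s(miss_distance_m: float, clock_error_s: float) -> float:
    """Speed implied by a given miss distance and clock error."""
    return miss_distance_m / clock_error_s


def clock_error_at(hours: float, ref_error_s: float = 1 / 3,
                   ref_hours: float = 100.0) -> float:
    """Clock error after `hours` of uptime, ASSUMING linear growth
    calibrated to the quoted one third of a second at 100 hours."""
    return ref_error_s * hours / ref_hours


def miss_distance_m(speed_m_per_s: float, clock_error_s: float) -> float:
    """Position error caused by a timing error at a given speed."""
    return speed_m_per_s * clock_error_s


if __name__ == "__main__":
    speed = implied_speed_m_per_s(600.0, 1 / 3)
    print(f"implied speed: about {speed:.0f} m/s")
    for hours in (8, 24, 100):
        distance = miss_distance_m(speed, clock_error_at(hours))
        print(f"{hours:>3} h uptime: miss distance of about {distance:.0f} m")
```

With the quoted numbers the implied speed is about 1,800 m/s, and under the linear-growth assumption a system restarted every 8 hours would accumulate only a few tens of meters of error.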
jasonmaydie, about 7 years ago
The Scud missile led to their deaths, not the software. There's no absolute guarantee it would have intercepted it, plus rebooting a deployed machine regularly is an acceptable fix when it's live in the field.