One thing to consider when looking at such things is that commercial avionics software systems are full of known limitations. I don't know whether this particular 51-day limitation was intentional, but in general:<p>Avionics software starts with writing comprehensive requirements. Once the software is developed to those requirements, it is tested against them: not always in a real functioning airplane, but often in smaller airplane-cockpit-like rigs and in purely simulated environments.<p>Nobody is going to write a requirement that says "this avionics subsystem will function without error forever". Even if you thought you could make that happen, you couldn't test it. So there are going to be boundaries. You might say that the subsystem will function for X days. What happens after that? It may well run just fine for X+1 days, or 2X days, or 100X days. But it's only required to run for X days, and it's only tested and certified for running X days.<p>I could easily imagine that this particular subsystem was required and certified for some value of X <= 51 days, and it just so happened that the subsystem started to fail after running for more than 51 days. Or it could have been a genuine mistake.<p>But even if the intended X wasn't 51 days, there was almost certainly some intended, finite value for X. We might say, "well, my laptop has run for three years without needing a reboot". Great! Is that a guaranteed, repeatable state of operation that the FAA would certify? Probably not. And besides, do we really want to endure a three-year verification test?<p>For most software, we are happy to say "it <i>should</i> run indefinitely". For avionics software, that's insufficient. We instead say "it <i>will</i> run for at least some specific, predetermined, finite amount of time" and then back up that statement with certifiable evidence.
<i>For example, let’s imagine that the timestamp set by the transmitting ES is close to its wrap-around value. After performing the required calculation, the receiving ES obtains a timestamp that has already wrapped-around, so it would look like the message had been received before it was actually sent.</i><p>Isn't it surprising that modular arithmetic, already employed successfully in TCP sequence numbers and the like, still gets implemented incorrectly today? What's even more disappointing is seeing all the other incredible systemic complexity they've added, and yet the plane apparently has no mechanical backup instruments?
I remember articles about the Airbus A350 requiring a reboot every N days (150-ish or so?). I remember the Patriot missile system required a reboot every 24 hours or so until the software defect that caused its time counter to drift was fixed. And I'm pretty sure there are many more such cases of devices failing if kept on too long, even in fields where you are supposed to fill out a lot of "paper"work and jump through a lot of defined processes, like avionics, medical, or automotive, among a good few others (safety and all that).
Reminds me of the LAX Air Traffic Control Shutdown of 2004: <a href="https://m.slashdot.org/story/49885" rel="nofollow">https://m.slashdot.org/story/49885</a>
Why?<p>Was it a cost issue?<p>Or was there an expectation that regular maintenance would occur within this time frame, with a reboot for diagnostics built into the maintenance check?