We recently had a series of large 18-card modular routers fail. They had been installed in different places around the world at roughly the same time. At random, one of them would do a total reboot and/or fail over to its backup control plane card. We checked everything and sent the routers back to the manufacturer's lab for analysis. It turned out the root cause was an information sticker placed incorrectly on the power supply modules, which caused their connectors to occasionally short to the router chassis. These large systems are so complex today that I think it will be the death of large routing hardware, and a more distributed model of smaller routers will take over. I think it already has in many cloud architectures.
'Twas a nice story!

I finished reading "When Sysadmins Ruled the Earth" by Cory Doctorow yesterday. Looks like this weekend is gonna be a sci-fi short stories weekend.