The event they describe is not a random bit flip (where you occasionally get a 1 when you should have gotten a 0, or vice versa). Instead, they found that one particular CPU core was defective and consistently got a certain math operation wrong.<p>Other than "disable the bad core", the right mitigation seems to be "RMA the sucker", since I am sure FB has a good process for that.
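
For what it's worth, here is a rough sketch of how one might hunt for a consistently wrong core before disabling it or sending the part back. This is not FB's method; it is a minimal illustration that assumes Linux (where `os.sched_setaffinity` exists), and the names `known_answer_check` and `find_suspect_cores` are made up for this example. Real defects of this kind tend to depend on specific instructions and data patterns, so a simple integer loop like this would probably miss most of them.

```python
import os

# Hard-coded expected result: sum of i*i for i = 1..1000.
# It is a constant so that a faulty core cannot "agree with itself".
EXPECTED_SUM_OF_SQUARES = 333833500


def known_answer_check():
    """Run a small deterministic computation and compare to the known answer."""
    total = 0
    for i in range(1, 1001):
        total += i * i
    return total == EXPECTED_SUM_OF_SQUARES


def find_suspect_cores(check=known_answer_check, rounds=100):
    """Pin this process to each allowed core in turn and run `check` repeatedly.

    Returns the sorted list of core IDs on which at least one round failed.
    Assumes Linux: os.sched_getaffinity / os.sched_setaffinity are not
    available on every platform.
    """
    if not hasattr(os, "sched_setaffinity"):
        raise RuntimeError("CPU affinity control is not available on this platform")

    original = os.sched_getaffinity(0)  # cores this process may currently use
    suspects = []
    try:
        for core in sorted(original):
            os.sched_setaffinity(0, {core})  # run only on this one core
            if not all(check() for _ in range(rounds)):
                suspects.append(core)
    finally:
        os.sched_setaffinity(0, original)  # always restore the original mask
    return suspects


if __name__ == "__main__":
    print("suspect cores:", find_suspect_cores())
```

A core that shows up in the result repeatedly across runs would be the one to take offline (and the CPU the one to RMA), whereas a one-off failure that never repeats looks more like the random case.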