Yup. SGX, TSX, all the interesting and complicated stuff seems to be getting deprecated after half a decade or more of "We got it! No, wait, we didn't... uh, this time we got it! Wait, crap, no... uh... but <i>this</i> time! Oh carp. Yeah, you know, screw it."<p>After several of those "Release, revert" cycles, it ends up as a self-fulfilling prophecy anyway - it's like the sentiment towards Google's new products you often see: "This, too, shall rapidly pass when they get bored." After you've seen TSX disabled on a few generations of chips, the motivation to put the work in to make something work with TSX just kind of evaporates, because you've no confidence that it'll actually work, or stay working, on the hardware you want to run on. And because of the requirement to have a fallback path, TSX is a good bit more work, and often requires more complexity, than a simple lock-based approach that's good enough and simple to understand/validate.<p>But my deeper concern is that it seems that nobody at Intel is capable of understanding all the interactions in the chip anymore - and SGX offers very strong evidence of this inability.<p>SGX made the strong claim that, when deployed, a <i>fully malicious ring 0 operating system</i> could neither observe anything about the state of the compute happening in the enclave, nor modify its operation. They did various interesting things with how pages were swapped out to prevent replay attacks, and really did try to build it such that you couldn't mess with it. 
But they did these things at a high level, and didn't fully understand the nature of the chip.<p>The L1TF (L1 Terminal Fault, also known as Foreshadow) attacks took advantage of edge-case L1 cache behavior to speculate out anything that was in L1 cache, <i>which included SGX enclave data.</i> If I remember correctly, because you could read out the stored register state as well as memory pages you faulted in, they demonstrated you could essentially single step a production SGX enclave with full register state and full memory state at every single instruction. Whoops.<p>It's not hard to mitigate once you know the problem - just flush L1 entirely on exit. But Intel didn't know it was a problem, so they didn't do that.<p>On the flip side, "influencing operation," there was Plundervolt. This involved an <i>undocumented</i> (grumble growl) MSR that exists to reduce the voltage of the chip and improve efficiency of operation. However, the OS (that untrusted ring 0 thing...) has control over this register. And there aren't sane limits on it, such that the OS can drop the voltage enough that things like "multiply" and "AES operations" start faulting and glitching (silently), without being low enough that the chip stops functioning. Enter an enclave in this state, wait for multiply or AES to fault in the useful ways they will, and you've just influenced operation such that you can pull keys out. Whoops.<p>Again, it's not hard to mitigate. Refuse to enter if the voltage isn't at stock settings (you can't just reset it on entry because it takes time for the VRMs to bring the voltage back up). 
But <i>Intel didn't do this.</i> The people who added this neat little efficiency hack and then kept it secret never rubbed shoulders with the people in charge of the new flagship security features, or with the sort of adversarial thinkers who can ask "Now, wait a minute, what if I push this beyond sane bounds?"<p>You can point at the other speculative stuff and claim it's not <i>really</i> a problem because architectural behavior is correct (I think that sort of reasoning is rubbish, when you can speculate your way past all security boundaries on the chip), but the SGX case, specifically, demonstrates that Intel didn't know about the problems or they would have taken the very simple mitigation steps. And <i>that</i> tells me that they can't reason about their chips as a whole.<p>... and <i>that</i> - the makers of the most critical components of the system not having a full understanding of how those components operate - is scary. The foundation of everything is in an unknown state, and nobody knows how broken it is until some researchers go in and figure it out.<p>More than once, after fixing the exact thing the researchers found, Intel has also had egg on their face of the "... so we found this very, very closely related, conceptually identical bug that they didn't fix with the last patches..." variety. It seems safe to say that there are university students and faculty who understand the security implications of Intel's design decisions better than the people at Intel in charge of such things.<p>We're running, very rapidly, out of "complexity runway." Everything, from the very chips on up, is so complex that nobody can reason about it, and the only solution to the very problems caused by complexity is, "Well, let's add more complexity to fix those problems." It's not the sort of thing that can go on forever.<p>Anyway. </rant about the state of Intel>