If Intel had shipped a library/compiler that <i>did</i> just use feature flags and didn't check the CPU vendor, and the resulting code used features that on AMD ran much more slowly than the equivalent unoptimized code, would people blame AMD for the slow instructions, or blame Intel for releasing a library/compiler that they didn't optimize for their competitor's processor?<p>This isn't a hypothetical; quoting <a href="https://en.wikipedia.org/wiki/X86_Bit_manipulation_instruction_set" rel="nofollow">https://en.wikipedia.org/wiki/X86_Bit_manipulation_instructi...</a> :<p>> AMD processors before Zen 3 that implement PDEP and PEXT do so in microcode, with a latency of 18 cycles rather than a single cycle. As a result it is often faster to use other instructions on these processors.<p>There's no feature flag for "technically supported, but slow, don't use it"; you have to check the CPU model for that.<p>All that said, the <i>right</i> fix here would have been to release this as open source, and then people could contribute optimizations for many different processors. But that would have required a decision to rely on winning in hardware quality, rather than sometimes squeezing out a "win" via software even in generations where the hardware quality isn't as good as the competition.
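<p>To make the PDEP/PEXT point concrete, here is a rough sketch of what "feature flag plus model check" dispatch could look like. It assumes Linux-style <code>/proc/cpuinfo</code> text as input. The function names <code>parse_cpuinfo</code> and <code>should_use_pdep</code> are made up for illustration, and the <code>0x19</code> cutoff rests on my assumption that Zen 3 corresponds to AMD family 19h; a real library would read CPUID directly and would likely need a finer-grained model table.

```python
# Hypothetical sketch: decide whether to take a PDEP/PEXT code path.
# Assumption: AMD parts with family < 0x19 (i.e. before Zen 3) have slow,
# microcoded PDEP/PEXT even though they advertise the bmi2 feature flag.

AMD_FAST_PDEP_MIN_FAMILY = 0x19  # assumed: Zen 3 is family 19h


def parse_cpuinfo(text):
    """Return (vendor, family, flags) for the first CPU in /proc/cpuinfo-style text."""
    info = {}
    for line in text.splitlines():
        if not line.strip():
            if info:
                break  # a blank line ends the first processor entry
            continue
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    vendor = info.get("vendor_id", "")
    family = int(info.get("cpu family", "0"))
    flags = set(info.get("flags", "").split())
    return vendor, family, flags


def should_use_pdep(vendor, family, flags):
    """Use PDEP/PEXT only if it is both supported and not known to be slow."""
    if "bmi2" not in flags:
        return False  # the feature flag says the instructions are missing
    if vendor == "AuthenticAMD" and family < AMD_FAST_PDEP_MIN_FAMILY:
        return False  # supported, but slow: the flag alone cannot tell us this
    return True


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            vendor, family, flags = parse_cpuinfo(f.read())
        print("use PDEP path:", should_use_pdep(vendor, family, flags))
    except OSError:
        print("no /proc/cpuinfo here; pass sample text to parse_cpuinfo instead")
```

<p>The second check in <code>should_use_pdep</code> is exactly the kind of vendor-and-model test that a flags-only dispatcher cannot express.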