The kicker:<p>"the reduction function is called with num set to the bit size, where it should be number of BN_ULONG elements (which are always 8 bytes large, because that is the size of an unsigned long on x64 systems, which is the only architecture which can have AVX512 support). So with the input sizes being 1024 bits, 8192 bytes are accessed (read from or written to) instead of 128."<p>It's really unfortunate that a performance optimization like this introduced a potential RCE. You would hope something like this would be caught by ASan/MSan or Valgrind. At least it was caught relatively quickly after release via fuzzing.<p>One bit of good news: since this requires AVX512, many CPUs won't hit it, including newer Intel chips such as Alder Lake: <a href="https://www.pcgamer.com/intel-kills-alder-lake-avx-512-support-for-good/" rel="nofollow">https://www.pcgamer.com/intel-kills-alder-lake-avx-512-suppo...</a>
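<p>To make the arithmetic in that quote concrete, here is a minimal sketch of the bits-versus-words mix-up. It is not OpenSSL's code: word_t is a stand-in for BN_ULONG on x86-64, and words_for_bits and bytes_touched are hypothetical helpers invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for OpenSSL's BN_ULONG on x86-64 (8 bytes); an assumption for this sketch. */
typedef uint64_t word_t;

/* Hypothetical helper: number of 64-bit words needed to hold `bits` bits, rounded up. */
size_t words_for_bits(size_t bits)
{
    assert(bits > 0);
    return (bits + 63) / 64;
}

/* Hypothetical helper: bytes a routine reads or writes if it treats `num` as a word count. */
size_t bytes_touched(size_t num)
{
    return num * sizeof(word_t);
}
```

With a 1024-bit input, bytes_touched(1024) is 8192 (the bit size passed where a word count was expected), while bytes_touched(words_for_bits(1024)) is 128, matching the numbers in the quote.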