re: <a href="https://github.com/uploadcare/pillow-simd#why-do-not-contribute-simd-to-the-original-pillow" rel="nofollow">https://github.com/uploadcare/pillow-simd#why-do-not-contrib...</a><p>They can create three versions of each affected function (fallback, SSE4 and AVX2), put each version in its own file (one file per set of compiler flags), compile each file with its own flags, and link them all together. The main module, which is compiled for a generic CPU, then runs cpuid and sets global function pointers to the right implementation.<p>After that, always call through the global function pointer. If the function is exported from a shared library, export only a wrapper that calls through the pointer, not the pointer or the individual implementations.<p>They do need to make sure that every version preserves the function's preconditions and postconditions, and that the generic code sets up whatever memory alignment/layout the optimized versions require.<p>I think x264 does this.
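<p>A minimal sketch of that dispatch pattern in C, assuming GCC or Clang on x86 for the detection step. Every name here (add_sat_u8_*, simd_dispatch_init, image_add_sat_u8, the file names in the comments) is made up for illustration and is not taken from Pillow or x264. The two "optimized" variants are placeholders that forward to the generic loop so everything fits in one file; in a real build each would sit in its own file with its own flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Contract shared by every variant:
 * dst[i] = min(a[i] + b[i], 255) for i in [0, n). */
typedef void (*add_sat_u8_fn)(uint8_t *dst, const uint8_t *a,
                              const uint8_t *b, size_t n);

/* Generic fallback, built with the baseline compiler flags. */
static void add_sat_u8_generic(uint8_t *dst, const uint8_t *a,
                               const uint8_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        unsigned sum = (unsigned)a[i] + b[i];
        dst[i] = (uint8_t)(sum > 255u ? 255u : sum);
    }
}

/* Placeholders (assumption): in the real scheme these would live in
 * separate files, e.g. a hypothetical add_sse4.c built with -msse4.1 and
 * add_avx2.c built with -mavx2, and would use intrinsics. Here they
 * forward to the generic loop so the sketch compiles anywhere. */
static void add_sat_u8_sse4(uint8_t *dst, const uint8_t *a,
                            const uint8_t *b, size_t n)
{
    add_sat_u8_generic(dst, a, b, n);
}

static void add_sat_u8_avx2(uint8_t *dst, const uint8_t *a,
                            const uint8_t *b, size_t n)
{
    add_sat_u8_generic(dst, a, b, n);
}

enum cpu_level { CPU_GENERIC, CPU_SSE4, CPU_AVX2 };

/* Runtime detection. Uses the GCC/Clang x86 builtins; on any other
 * compiler or architecture it reports the generic level. */
static enum cpu_level detect_cpu(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return CPU_AVX2;
    if (__builtin_cpu_supports("sse4.1"))
        return CPU_SSE4;
#endif
    return CPU_GENERIC;
}

/* Global function pointer. It starts at the fallback, so a call made
 * before simd_dispatch_init() still gives the right result. */
static add_sat_u8_fn add_sat_u8_impl = add_sat_u8_generic;

/* Called once from the generically compiled main module. */
void simd_dispatch_init(void)
{
    switch (detect_cpu()) {
    case CPU_AVX2: add_sat_u8_impl = add_sat_u8_avx2;    break;
    case CPU_SSE4: add_sat_u8_impl = add_sat_u8_sse4;    break;
    default:       add_sat_u8_impl = add_sat_u8_generic; break;
    }
}

/* Exported wrapper: callers never see the pointer or the variants. */
void image_add_sat_u8(uint8_t *dst, const uint8_t *a,
                      const uint8_t *b, size_t n)
{
    assert(n == 0 || (dst != NULL && a != NULL && b != NULL));
    add_sat_u8_impl(dst, a, b, n);
}
```

<p>Since all variants share one typedef and one documented contract, the same test vectors can be run against each of them directly, which is a cheap way to check that the pre/postconditions really are preserved. Pointing the global at the fallback by default means forgetting to call the init function costs speed, not correctness.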