> faster than a memcpy() OS call<p>I usually don't nitpick terminology, but memcpy() is a C runtime library function, not a Linux/Win32 OS call.
> Blosc comes with a pre-filter (also called pre-conditioner) called shuffle which rearranges bytes in a clever way for the compression stage.<p>This sounds like the Burrows-Wheeler transform, which bzip2 uses:<p><a href="https://en.wikipedia.org/wiki/Burrows%E2%80%93Wheeler_transform" rel="nofollow">https://en.wikipedia.org/wiki/Burrows%E2%80%93Wheeler_transf...</a>
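<p>For reference, here is a naive Python sketch of the Burrows-Wheeler transform, just to show what kind of byte rearrangement I mean. It sorts all rotations of the input and keeps the last column. The function names and the `$` end marker are my own choices for illustration. This is not how bzip2 implements it (real implementations avoid building every rotation), and the quoted description alone doesn't settle whether Blosc's shuffle works this way.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Naive Burrows-Wheeler transform: last column of the sorted rotations."""
    if sentinel in text:
        raise ValueError("input must not contain the sentinel character")
    s = text + sentinel  # unique end marker makes the transform invertible
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column: str, sentinel: str = "$") -> str:
    """Naive inverse: rebuild the sorted rotation table one column at a time."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        # Prepend the last column to every row, then re-sort.
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The original string is the row that ends with the sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    transformed = bwt("banana")
    print(transformed)               # annb$aa
    print(inverse_bwt(transformed))  # banana
```

<p>Note how the output clusters identical characters ("nn", "aa"), which is what makes the following compression stage more effective.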