I have played with a C++ implementation of RFC6330, and while I have to admit I never optimized it for performance, I find this benchmark a bit... pointless.<p>RaptorQ is essentially Gaussian elimination on a matrix, so it all depends on the block size (=> matrix size).
The algorithm has basically cubic complexity in the number of symbols in a block.
RFC6330 is made to work on files, which are divided into blocks with a certain number of symbols, and the bytes are interleaved.<p>This implementation does not do the (complex and almost pointless) interleaving, which is fine; even OpenRQ does not.<p>The bench seems to be done on a... 10 KB file?
It all fits in L2. We are not given the symbol size (which determines the block size!), and I assume all of this fits in a 10x10 matrix.<p>You are benchmarking operations on what is (more or less) a 10x10 byte matrix.<p>The biggest part of this benchmark might well be the generation of repair symbols (was it even done?), since that requires repeatedly XORing the above-mentioned symbols.<p>This is much closer to micro-benchmarking than an actual benchmark, imho. It would have been more interesting to see what happens with files that at least exceed the L3 cache.<p>You can also cache intermediate results (which he does not do). That is especially useful for encoding, but only when working on matrices of 100x100 or larger; otherwise just searching the cache, fetching from memory (my implementation optionally did LZ4 compression/decompression) and doing a matrix multiplication is slower than just computing the matrix again.<p>Still, it's nice to see implementations of the RFC, which is a real pain to read...
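<p>To put rough numbers on the block-size point, here is a minimal Python sketch. It assumes the whole file is treated as a single block and ignores the padding and extra constraint rows the RFC adds to the matrix. The function names are made up for illustration, and the cost figure is just the plain cubic Gaussian-elimination scaling mentioned above, not a measurement of any real implementation.

```python
def symbols_in_block(file_size: int, symbol_size: int) -> int:
    """Source symbols needed if the whole file is one block (assumption).

    Integer ceiling division; the last symbol is implicitly padded.
    """
    if file_size <= 0 or symbol_size <= 0:
        raise ValueError("sizes must be positive")
    return -(-file_size // symbol_size)


def relative_cubic_cost(k: int, k_ref: int) -> float:
    """How much more work k symbols cost than k_ref, assuming O(k^3)."""
    return (k / k_ref) ** 3


if __name__ == "__main__":
    symbol_size = 1_000  # hypothetical; the benchmark does not state it
    k_small = symbols_in_block(10_000, symbol_size)      # ~10 KB file
    k_large = symbols_in_block(10_000_000, symbol_size)  # ~10 MB file
    print("symbols (10 KB):", k_small)
    print("symbols (10 MB):", k_large)
    print("relative cost:", relative_cubic_cost(k_large, k_small))
```

With the assumed 1000-byte symbols, a 10 KB file is only 10 symbols, which is why the matrix ends up so tiny. Pick a different symbol size and the numbers shift, which is exactly why the benchmark should have stated it.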