<a href="http://catid.mechafetus.com/news/news.php?view=281" rel="nofollow">http://catid.mechafetus.com/news/news.php?view=281</a><p>Ideas for Online Codes
by catid, posted 9:35pm Thu. Jan 26th 2012 PST
Online Codes are an unpatented approach to rateless forward error correction (FEC) from a 2002 paper by Petar Maymounkov. I first learned about Online Codes from a Wikipedia article, and since then I have been reading as much as I can about how to implement them with good performance.<p>Since that paper was published, Luby et al. have made fantastic advances in rateless error correction codes, culminating in RaptorQ last year. Unfortunately they decided to patent their algorithms, which makes them effectively unusable for almost everyone on the Internet. So, roll back the clock by a decade and start over. <i>sigh</i><p>Online Codes are a great place to start. I like the layered approach, where there is an inner code and an outer code. After reading about a lot of modern implementations, I have some ideas to try to see what will help Online Codes reach good performance:<p>(1) Peeling decoder. This is an unpatented approach to decoding sparse low-density parity check (LDPC) codes. It runs in O(N), is fast, and is pretty easy to get working.<p>(2) Gaussian elimination decoder. This is an unpatented approach to solving linear systems of equations when the equations are not sparse.<p>(3) Combine (1) and (2). This unpatented hybrid provides maximum-likelihood decoding in roughly O(N) time. The "Raptor Codes" monograph includes a description of this algorithm, so it must work well in practice. I've implemented it myself and reached respectable throughput (300 MB/s) before any optimization. I have a few ideas for how to optimize it further:<p>+ Reduce the number of variables needed by analyzing how data is passed between parts of the algorithm and eliminating unneeded data.
+ Profile the application to find the hot spots, and move precomputation into the parts of the code that take up less time.
+ When back-substituting the symbols recovered by Gaussian elimination (2) into the symbols recovered by peeling (1), the check matrix is fairly dense. Borrowing a trick from windowed modular exponentiation, XOR combinations of pivot symbols can be precomputed and stored in a table, greatly reducing the number of XORs required. I think this can be implemented without any additional memory, since by that point in the algorithm many of the received rows have already been copied over to the output file.
+ Reduce the number of memory copies by delaying memcpy() until back-substitution and using memxor() where possible to avoid the copy.<p>(4) Use two inner codes instead of one. The second inner code would be based on GF(256) octets instead of GF(2) like the rest of the check matrix. Since the normal overhead is low, adding just a few check matrix rows in a higher field should improve the recovery properties a lot. This seems like a logical extension to the layered approach of Online Codes.<p>(5) Decode all codes at once. The outer code might be combined with the inner codes into one large matrix to solve. This would raise decoding performance to that of the maximum-likelihood decoder, and should work well with the decoder I'm writing. We'll see...<p>(6) Since A = Decode(Encode(A)) and A = Encode(Decode(A)), the code can be made systematic by precomputing Decode(A) on the transmitter and then running Encode() forward, so that the first symbols sent are equal to the input data file. This can reduce processing time but still allows check symbols to be generated afterwards with good recovery properties. There should be a lot of room for optimization on the transmitter side too, since it already knows what the output should be.<p>(7) Raptor codes use some kind of "permanent inactivation" technique that makes the outer code more complicated but apparently helps the recovery properties when the number of symbols is low. It might be patented and unusable; we'll see...
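<p>The peeling decoder in (1) can be sketched in a few lines. Below is a toy Python version, assuming each received check symbol is the XOR of a known subset of source symbols; the names are mine, purely for illustration, not from any real codec:

```python
def xor_bytes(a, b):
    """XOR two equal-length byte strings (GF(2) addition)."""
    return bytes(x ^ y for x, y in zip(a, b))

def peel_decode(n, checks):
    """Peeling decoder sketch. checks is a list of (indices, value)
    pairs where value is the XOR of the source symbols at those
    indices. Repeatedly find a degree-1 equation, solve it, and
    substitute the result into the rest. Returns {index: symbol}
    for every symbol the peeler can reach."""
    checks = [[set(idx), val] for idx, val in checks]
    recovered = {}
    progress = True
    while progress:
        progress = False
        for eq in checks:
            idx, val = eq
            # Substitute already-recovered symbols out of this equation.
            solved = {i for i in idx if i in recovered}
            for i in solved:
                val = xor_bytes(val, recovered[i])
            idx.difference_update(solved)
            eq[1] = val
            # A degree-1 equation: its one remaining symbol is now known.
            if len(idx) == 1:
                recovered[idx.pop()] = val
                progress = True
    return recovered
```

The loop is O(N) in spirit because each symbol is substituted out of each equation at most once; a real implementation would track equation degrees with a work queue instead of rescanning.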
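<p>For (2), dense GF(2) Gaussian elimination maps nicely onto integer bitmasks: each equation is a mask over the source symbols plus its XOR value. A minimal Gauss-Jordan sketch (again a toy of mine, not tuned code):

```python
def xor_bytes(a, b):
    """XOR two equal-length byte strings (GF(2) addition)."""
    return bytes(x ^ y for x, y in zip(a, b))

def ge_solve(n, rows):
    """Solve a dense GF(2) system: each row is (bitmask, value),
    meaning the XOR of the source symbols whose bits are set equals
    value. Returns the n symbols, or None if the system is singular."""
    rows = [list(r) for r in rows]
    pivots = []
    for col in range(n):
        # Pick any remaining row with this column's bit set as the pivot.
        hit = next((i for i, r in enumerate(rows) if (r[0] >> col) & 1), None)
        if hit is None:
            return None
        pivot = rows.pop(hit)
        # Clear this column from every other row (Gauss-Jordan over GF(2)).
        for r in rows + pivots:
            if (r[0] >> col) & 1:
                r[0] ^= pivot[0]
                r[1] = xor_bytes(r[1], pivot[1])
        pivots.append(pivot)
    # Every pivot row has been reduced to a single bit: its own column.
    out = [None] * n
    for mask, val in pivots:
        out[mask.bit_length() - 1] = val
    return out
```

A real implementation would operate on packed machine words and choose pivots by row density, but the linear algebra is the same.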
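<p>The table trick from the back-substitution bullet above is the same idea as the precomputed window in windowed modular exponentiation: build the XOR of every subset of a small window of w pivot symbols once (2^w entries), and each dense row then pays one table lookup and one XOR for that window instead of up to w XORs. A sketch, with illustrative names:

```python
def xor_bytes(a, b):
    """XOR two equal-length byte strings (GF(2) addition)."""
    return bytes(x ^ y for x, y in zip(a, b))

def build_window_table(window):
    """Precompute the XOR of every subset of the w symbols in `window`.
    table[mask] is the XOR of window[i] for every set bit i of mask;
    each entry is built from a smaller entry with one extra XOR, so
    the whole table costs 2^w - 1 XORs to fill."""
    w = len(window)
    table = [bytes(len(window[0]))] * (1 << w)
    for mask in range(1, 1 << w):
        low = mask & -mask  # lowest set bit of mask
        table[mask] = xor_bytes(table[mask ^ low], window[low.bit_length() - 1])
    return table
```

A dense row whose window bits form the mask `m` then substitutes the whole window at once with `row = xor_bytes(row, table[m])`.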
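<p>For the GF(256) rows in (4), addition stays plain XOR but multiplication needs field arithmetic. A toy shift-and-reduce multiply; the 0x11D field polynomial is my assumption (a common choice for erasure codes), and a real codec would use log/exp tables instead:

```python
def gf256_mul(a, b, poly=0x11D):
    """Multiply two GF(256) octets by carry-less shift-and-add,
    reducing modulo the field polynomial whenever the running
    product overflows 8 bits. NOTE: poly=0x11D is an assumed
    (but common) choice of field polynomial."""
    r = 0
    while b:
        if b & 1:
            r ^= a       # add (XOR) the current shifted copy of a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= poly    # reduce back into 8 bits
    return r
```

With this in place, a GF(256) check row is a dot product of octet coefficients with the symbols, instead of a plain XOR of a symbol subset.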