Whoa - 10 minutes ago, out of the blue, I was poking around on Wikipedia reading about compression algorithms and thinking it might be a fun side project to mess with something like that. Now I reload my RSS reader and find this!

Life is strange sometimes...

What I started wondering is whether you could get better compression if the compressor and decompressor could negotiate before the compression happened. For example (ignoring CPU constraints), when I'm downloading a large file over the web, what if the web server and the browser conspired to achieve a higher compression ratio customized just for me, using some kind of shared knowledge (like a history of what was recently downloaded) to transfer less data? (Say I had a bunch of files on my client that the server also has: could it take small chunks of those known files and just index their locations in the new file to be transferred, saving me download time?)

Obviously this is not a fully formed thought... but I think it's worth considering that compression need not be bound only by the type of data, but also by the receiver of that data and the intention and method of its transmission.
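
For what it's worth, this is roughly the idea behind shared-dictionary compression and rsync-style delta transfer. Here's a minimal sketch with Python's zlib, assuming both sides already hold the same dictionary bytes (say, a file the client downloaded earlier; the payload and dictionary below are made-up placeholders):

    import zlib

    # Assumption for the sketch: both client and server already share these
    # bytes, e.g. an older file the client downloaded earlier.
    shared_dictionary = (
        b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n"
        b"Accept-Encoding: gzip\r\n"
    ) * 4

    # New payload to send; it overlaps heavily with what the client has.
    payload = (
        b"GET /about.html HTTP/1.1\r\nHost: example.com\r\n"
        b"Accept-Encoding: gzip\r\n"
    )

    # Server side: compress with and without the shared dictionary.
    plain = zlib.compress(payload, 9)

    co = zlib.compressobj(level=9, zdict=shared_dictionary)
    with_dict = co.compress(payload) + co.flush()

    print(f"no dictionary:   {len(plain)} bytes")
    print(f"with dictionary: {len(with_dict)} bytes")

    # Client side: the same dictionary must be supplied to decompress.
    do = zlib.decompressobj(zdict=shared_dictionary)
    assert do.decompress(with_dict) == payload

Real systems in this family include rsync's delta transfer and zstd's dictionary mode; the sketch only shows the core trick of letting previously shared bytes shorten the new stream.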
It seems to me that for backup you'd want a file format with redundancy, or one that could tolerate a few bit or sector errors while remaining mostly readable. That seems to conflict with most compression schemes.
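
One quick way to see the conflict: flip a single bit in a zlib stream and everything after it is usually unrecoverable, whereas a bit flip in a plain text file garbles only one character. A rough sketch (the sample data is arbitrary):

    import zlib

    original = b"backup data " * 1000
    compressed = bytearray(zlib.compress(original, 9))

    # Flip a single bit somewhere in the middle of the compressed stream.
    compressed[len(compressed) // 2] ^= 0x01

    try:
        recovered = zlib.decompress(bytes(compressed))
        # Even if decompression "succeeds", the contents may not match.
        print("decompressed, matches original:", recovered == original)
    except zlib.error as exc:
        # The usual outcome: the stream can't be decoded past the flip.
        print("unrecoverable:", exc)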