Last time this came up on HN, I did some research, and discovered that <i>lzip</i> was quite non-robust in the face of data corruption: a single bit flip in the right place in an lzip archive could cause the decompressor to silently truncate the decompressed data, <i>without</i> reporting an error. Not only that, this vulnerability was a direct consequence of one of the features used to claim superiority to XZ: namely, the ability to append arbitrary “trailing data” to an lzip archive without invalidating it.<p>Like some other compressed formats, an lzip file is just a series of compressed blocks concatenated together, each block starting with a magic number and containing a certain amount of compressed data. There’s no overall file header, nor any marker that a particular block is the last one. This structure has the advantage that you can simply concatenate two lzip files, and the result is a valid lzip file that decompresses to the concatenation of what the inputs decompress to.<p>Thus, when the decompressor has finished reading a block and sees there’s more input data left in the file, there are two possibilities for what that data could contain. It could be another lzip block corresponding to additional compressed data. Or it could be <i>any other</i> random binary data, if the user is taking advantage of the “trailing data” feature, in which case the rest of the file should be silently ignored.<p>How do you tell the difference? Simply enough, by checking if the data starts with the 4-byte lzip magic number. If the magic number itself is corrupted in any way? Then the entire rest of the file is treated as “trailing data” and ignored. I hope the user notices their data is missing before they delete the compressed original…<p>It might be possible to identify an lzip block that has its magic number corrupted, e.g. by checking whether the trailing CRC is valid. However, at least at the time I discovered this, lzip’s decompressor made no attempt to do so. 
It’s possible the behavior has improved in later releases; I haven’t checked.<p>But at least at the time this article was written: pot, meet kettle.
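<p>To make the failure mode concrete, here’s a minimal Python sketch of the decision logic described above. It is not lzip’s actual code or file format: only the 4-byte magic is real, while the member layout (magic, length, payload, CRC32) and the names <i>make_member</i> and <i>decode</i> are made up for illustration, and nothing is actually compressed.

```python
import struct
import zlib

MAGIC = b"LZIP"  # the 4-byte magic; the rest of this layout is a toy, not the real format


def make_member(payload: bytes) -> bytes:
    """Toy member: magic, 4-byte little-endian length, payload, CRC32 of payload."""
    header = MAGIC + struct.pack("<I", len(payload))
    trailer = struct.pack("<I", zlib.crc32(payload))
    return header + payload + trailer


def decode(archive: bytes) -> bytes:
    """Decode concatenated toy members, tolerating trailing data after the last one."""
    out = bytearray()
    pos = 0
    while pos < len(archive):
        if archive[pos:pos + 4] != MAGIC:
            if pos == 0:
                raise ValueError("not an archive: bad magic in first member")
            # After at least one good member, anything that does not start
            # with the magic is assumed to be trailing data and is skipped
            # without an error. A corrupted magic lands here too.
            break
        (length,) = struct.unpack_from("<I", archive, pos + 4)
        payload = archive[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack_from("<I", archive, pos + 8 + length)
        if zlib.crc32(payload) != crc:
            raise ValueError("CRC mismatch")
        out += payload
        pos += 12 + length  # 8 bytes of header + payload + 4 bytes of CRC
    return bytes(out)


first = make_member(b"first half, ")
archive = first + make_member(b"second half")

damaged = bytearray(archive)
damaged[len(first)] ^= 0x01  # flip one bit in the second member's magic

print(decode(archive))         # b'first half, second half'
print(decode(bytes(damaged)))  # b'first half, ' and no error is raised
```

Corruption inside a payload is caught by the CRC check, but corruption of the magic never reaches that check, which is why the second call returns truncated output without complaint.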