This guy used to go around GNU mailing lists (and others) trying to get us to use lzip.<p><a href="https://gcc.gnu.org/ml/gcc/2017-06/msg00044.html" rel="nofollow">https://gcc.gnu.org/ml/gcc/2017-06/msg00044.html</a><p><a href="https://lists.debian.org/debian-devel/2017/06/msg00433.html" rel="nofollow">https://lists.debian.org/debian-devel/2017/06/msg00433.html</a><p>It was a bit bizarre when he hit the Octave mailing list.<p>Eventually, people just wanted xz back:<p><a href="http://octave.1599824.n4.nabble.com/opinion-bring-back-Octave-xz-source-release-td4683705.html" rel="nofollow">http://octave.1599824.n4.nabble.com/opinion-bring-back-Octav...</a>
Interestingly, since "recovery" is mentioned several times, I decided to test it myself.<p>I took a jpeg image, compressed separate copies with gzip and bzip2, then modified one byte in each with a hex editor.<p>The recovery instructions for gzip are simply to do "zcat corrupt_file.gz > corrupt_file", while for bzip2 you use the bzip2recover command, which just dumps the blocks out individually (corrupt ones and all).<p>Uncompressing the corrupt gzip jpeg via zcat always resulted in an image file the same size as the original that could be opened with any image viewer, although the colors were clearly off.<p>I never could recover the image compressed with bzip2. Extracting all the blocks produced by bzip2recover via bzcat would just choke on the single corrupted block. And the smallest you can make a block is 100K (vs 32K for gzip?). Obviously pulling 100K out of a jpeg will not work.<p>Though I'm still confused as to how the corrupted gzip file extracted to a file of the same size as the original. I guess gzip writes out the corrupted data as well instead of choking on it? I guess gzip is the winner here. Having a file with a corrupted byte is much better than having a file with 100K of data missing...
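<p>(For the curious, here is roughly how that test can be scripted; the filenames and byte offset are placeholders, and the overwritten byte has to land inside the compressed data for the archive to actually be damaged:)<p><pre><code># compress two copies of the image
gzip -c photo.jpg > photo.jpg.gz
bzip2 -c photo.jpg > photo.jpg.bz2

# overwrite one byte somewhere in the middle of each archive (offset is arbitrary)
printf '\x00' | dd of=photo.jpg.gz  bs=1 seek=50000 conv=notrunc
printf '\x00' | dd of=photo.jpg.bz2 bs=1 seek=50000 conv=notrunc

# gzip route: decompress and keep whatever comes out
zcat photo.jpg.gz > recovered.jpg

# bzip2 route: split the archive into per-block .bz2 files, then concatenate
bzip2recover photo.jpg.bz2
bzcat rec*.bz2 > recovered2.jpg   # chokes on the corrupt block</code></pre>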
Many of the complaints here are reasonable, but I thought compression format was generally orthogonal to parity, which is what I assume is actually wanted for long-term archiving? I always figured the goal should be to get a bit-perfect copy of whatever went in back out, using something like Parchive at the file level or ZFS at the filesystem level for online storage. On the principle of layers and graceful failure modes it's better if even sub-archives can tolerate some corruption without total failure, and from a long-term, implementation-independence perspective simpler and better-specified is preferable, but none of that substitutes for building in enough parity to both notice corruption and fully recover from it, up to fairly extreme levels.
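<p>If I remember the par2cmdline syntax right, the Parchive workflow I have in mind looks roughly like this (the redundancy percentage and filenames are just examples):<p><pre><code># create ~10% recovery data alongside the archive
par2 create -r10 files.tar.xz.par2 files.tar.xz

# later: detect corruption, and repair from the recovery volumes if needed
par2 verify files.tar.xz.par2
par2 repair files.tar.xz.par2</code></pre>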
No file format is perfect, but I've been using xz for years and can't think of a single issue I've had with it. The compression ratio is dramatically better than gzip or bzip2 for many types of archives (especially when there is a lot of redundancy; for example, when compressing spidered web pages from the same site you can get well over 99% size reduction versus about 70% for gzip, i.e. less than a thirtieth of the disk space).<p>Lately I have been using zstd for some things, since it gives good compression and is much faster than xz.<p>This criticism of xz just seems nitpicky and impractical, especially if you are compressing tar archives and/or storing them on some kind of RAID that can correct some read errors (such as RAID 5).
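<p>(Re zstd: assuming a GNU tar new enough to have built-in zstd support, the switch is trivial; otherwise just pipe through the compressor. Directory name is made up:)<p><pre><code># xz: best ratio, but slow
tar -cJf pages.tar.xz spidered_pages/

# zstd: slightly larger output, much faster (use -I 'zstd -19' for a higher level)
tar --zstd -cf pages.tar.zst spidered_pages/</code></pre>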
I remember seeing this article before. This time my reaction is: if you want long-term archiving but don't assume redundant storage, it's not going to go well. Put your long-term archives on ZFS.
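<p>Something like this is what I have in mind (pool, dataset, and device names made up); the mirror gives ZFS a second copy to heal from, and regular scrubs catch bit rot before both copies go bad:<p><pre><code># mirrored pool: checksummed blocks can be repaired from the other copy
zpool create archive mirror /dev/sdb /dev/sdc
zfs create archive/longterm

# run periodically to detect and repair latent corruption
zpool scrub archive
zpool status archive</code></pre>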
A bit of speculation here, but perhaps xz won over lzip because it has a real manpage?<p>lzip has the usual infuriating short summary of options with a "run info lzip for the complete manual". Also the source code repository doesn't even seem linked directly from the lzip homepage - technical considerations aren't the only thing that determines if software is "better", it also has to be well presented.
If you first use tar to preserve xattrs etc., then you can use anything to compress: xz, bz2, 7z, even arj if you are feeling nostalgic.<p><pre><code> tar cvfJ ./files.tar.xz /some/dir</code></pre>
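<p>One caveat: GNU tar only stores extended attributes if you ask for them, so for the "anything to compress" route something like this is closer to what I mean (compressor choice is arbitrary):<p><pre><code>tar --xattrs -cvf - /some/dir | xz -9 > files.tar.xz
tar --xattrs -cvf - /some/dir | zstd > files.tar.zst</code></pre>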
> <i>"3 Then, why some free software projects use xz?"</i><p>Because the files are usually smaller than with gzip, decompression is faster than with bzip2, and the library is available on most systems.