What a great historical summary. Compression has moved on now, but having grown up marveling at PKZip and maximizing usable space on very early computers, as well as compression in modems (v42bis ftw!), this field has always seemed magical.<p>These days it is generally better to prefer Zstandard to zlib/gzip, for many reasons. And if you need a seekable format, consider squashfs as a reasonable choice. These stand on the shoulders of the giants of zlib and zip but do indeed stand much higher in the modern world.
Fun fact: in a sense, gzip can hold multiple files, but not in an especially useful way ...<p><pre><code> $ echo meow >cat
$ echo woof > dog
$ gzip cat
$ gzip dog
$ cat cat.gz dog.gz >animals.gz
$ gunzip animals.gz
$ cat animals
meow
woof</code></pre>
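For anyone who wants to poke at this without touching the filesystem, here is a minimal sketch of the same experiment using Python's standard `gzip` module. The variable names are just for illustration. It shows why this is not very useful: the members decompress into one joined stream, with no per-file names or boundaries in the output.

```python
import gzip

# Compress two payloads separately, as `gzip cat` and `gzip dog` would.
cat_gz = gzip.compress(b"meow\n")
dog_gz = gzip.compress(b"woof\n")

# Byte-level concatenation, like `cat cat.gz dog.gz >animals.gz`.
animals_gz = cat_gz + dog_gz

# A gzip stream may contain several members. Decompression simply joins
# their contents, so the result is one blob, not two separate files.
animals = gzip.decompress(animals_gz)
print(animals.decode(), end="")
```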
Interesting -- I did not realize that the zip format supports lzma, bzip2, and zstd. What software supports those compression methods? Can Windows Explorer read zip files produced with those compression methods?<p>(I have been using 7zip for about 15 years to produce archive files that have an index and can quickly extract a single file and can use multiple cores for compression, but I would love to have an alternative, if one exists).
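One data point on software support: Python's standard `zipfile` module can write and read members compressed with Deflate, bzip2, and LZMA (assuming the interpreter was built with the `bz2` and `lzma` modules, which is the usual case). The sketch below writes one member per method into an in-memory archive, then uses the central directory to pull out a single member. The member names are made up, and this says nothing about what Windows Explorer accepts.

```python
import io
import zipfile

payload = {
    "deflate.txt": (b"a" * 1000, zipfile.ZIP_DEFLATED),
    "bzip2.txt": (b"b" * 1000, zipfile.ZIP_BZIP2),
    "lzma.txt": (b"c" * 1000, zipfile.ZIP_LZMA),
}

# Write one member per compression method into an in-memory zip file.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    for name, (data, method) in payload.items():
        zf.writestr(name, data, compress_type=method)

# Reopen it: the central directory records the method used per member,
# and a single member can be read without extracting the others.
with zipfile.ZipFile(buf) as zf:
    methods = {info.filename: info.compress_type for info in zf.infolist()}
    one = zf.read("lzma.txt")

print(methods)
```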
Found this hilarious:<p>> This post is packed with so much history and information that I feel like some citations need be added<p>> I am the reference<p>(extracted a part of the conversation)
Is there an archive format that supports appending diffs of an existing file, so that multiple versions of the same file are stored? PKZIP has a proprietary extension (supposedly), but I couldn't find any open version of that.<p>(I was thinking of creating a version control system whose .git directory equivalent is basically an archive file that can easily be emailed, etc.)
If you are interested in the implementation details of how to unpack/decompress these formats, check out these Python implementations:<p>- <a href="https://github.com/onekey-sec/unblob/blob/main/unblob/handlers/archive/zip.py">https://github.com/onekey-sec/unblob/blob/main/unblob/handle...</a><p>- <a href="https://github.com/onekey-sec/unblob/blob/main/unblob/handlers/compression/gzip.py">https://github.com/onekey-sec/unblob/blob/main/unblob/handle...</a><p>- <a href="https://github.com/onekey-sec/unblob/blob/main/unblob/handlers/compression/zlib.py">https://github.com/onekey-sec/unblob/blob/main/unblob/handle...</a>
There's also pzip/punzip (<a href="https://github.com/ybirader">https://github.com/ybirader</a>) for those wanting more performant (concurrent) zip/unzip.<p>Disclaimer: I'm the author.
Combined with tar, gzip can be used to (de)compress directories recursively into a variable:<p>FOO=$(tar cf - folderToCompress | gzip | base64)<p>echo "$FOO" | base64 -d | zcat | tar xf -
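The same round trip can be sketched in Python with only the standard library, for cases where a shell pipeline is not available. The helper names `pack_dir` and `unpack_dir` are hypothetical, and the demo builds a throwaway `folderToCompress` in a temporary directory rather than assuming one exists.

```python
import base64
import io
import os
import tarfile
import tempfile


def pack_dir(path: str) -> str:
    """Return a directory tree as base64 text of a gzip-compressed tar."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        # tar.add recurses into subdirectories by default.
        tar.add(path, arcname=os.path.basename(path))
    return base64.b64encode(buf.getvalue()).decode("ascii")


def unpack_dir(text: str, dest: str) -> None:
    """Decode the base64 text and extract the tar.gz under dest."""
    buf = io.BytesIO(base64.b64decode(text))
    with tarfile.open(fileobj=buf, mode="r:gz") as tar:
        tar.extractall(dest)


with tempfile.TemporaryDirectory() as tmp:
    # Build a small tree to pack.
    src = os.path.join(tmp, "folderToCompress")
    os.makedirs(os.path.join(src, "sub"))
    with open(os.path.join(src, "sub", "cat"), "w") as f:
        f.write("meow\n")

    foo = pack_dir(src)  # plays the role of $FOO

    out = os.path.join(tmp, "out")
    os.makedirs(out)
    unpack_dir(foo, out)
    with open(os.path.join(out, "folderToCompress", "sub", "cat")) as f:
        restored = f.read()

print(restored, end="")
```

Only unpack text you trust: extracting an archive from an untrusted source can write files outside the intended directory on older Python versions.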