Now using Zstandard instead of xz for package compression

271 points by nloomans over 5 years ago

23 comments

Lammy over 5 years ago
Meta: This post is yet another victim of the HN verbatim title rule, despite the verbatim title making little sense as one of many headlines on a news page.

How is "Now using Zstandard instead of xz for package compression" followed by the minuscule, low-contrast grey "(archlinux.org)" better than "Arch Linux now using Zstandard instead of xz for package compression", like it was when I originally read this a few hours ago?
WinonaRyder over 5 years ago
Zstandard is awesome!

Earlier last year I was doing some research that involved repeatedly grepping through over a terabyte of data, most of which were tiny text files that I had to un-zip/7zip/rar/tar, and it was painful (maybe I needed a better laptop).

With zstd I was able to re-compress the whole thing down to a few hundred gigs and use ripgrep, which solved the problem beautifully.

Out of curiosity I tested compression with (single-threaded) lz4 and found that multi-threaded zstd was pretty close. It was an unscientific and maybe unfair test, but I found it amazing that I could get lz4-ish compression speeds, at the cost of more CPU, but with much better compression ratios.

EDIT: Btw, I use Arch :) - yes, on servers too.
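For anyone wanting to reproduce that workflow, a minimal sketch, assuming zstd and ripgrep are installed (the path and search pattern are placeholders):

    # recompress a tree of text files to .zst, removing the originals on success
    find ~/data -type f -name '*.txt' -exec zstd -q -T0 --rm {} +

    # ripgrep can search inside the compressed files directly
    rg --search-zip 'needle' ~/data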
filereaper over 5 years ago
Apparently this is how to use zstd with tar, if anyone else was wondering:

    tar -I zstd -xvf archive.tar.zst

https://stackoverflow.com/questions/45355277/how-can-i-decompress-an-archive-file-having-tar-zst

Hopefully another option that simplifies this gets added to tar if this compression becomes mainstream.
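Recent GNU tar does in fact have such an option; a short sketch of creating and extracting such archives, assuming GNU tar 1.31 or later and zstd on the PATH (file names are illustrative):

    # create an archive, compressing with zstd
    tar --zstd -cvf archive.tar.zst somedir/

    # extract it again
    tar --zstd -xvf archive.tar.zst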
cmurf over 5 years ago
Fedora 31 switched RPM to use zstd. https://fedoraproject.org/wiki/Changes/Switch_RPMs_to_zstd_compression

Package installations are quite a bit faster, and while I don't have any numbers, I expect that ISO image compose times are faster too, since the compose performs an installation from RPM to create each of the images.

Hopefully in the near future the squashfs image on those ISOs will use zstd as well, not only for the client-side speed boost for boot and install, but because it cuts the CPU hit of lzma decompression by a lot (more than 50%). https://pagure.io/releng/issue/8581
m4rtink over 5 years ago
BTW, Fedora recently switched to zstd compression for its packages as well, for basically the same reasons: much better overall de/compression speed while keeping the result mostly the same size.

One more benefit of zstd compression that is not widely noted: a zstd file compressed with multiple threads is binary-identical to the same file compressed with a single thread. So you can use multi-threaded compression and still end up with the same file checksum, which is very important for package signing.

On the other hand, xz, which was used before, produces a *binary different* file depending on whether it was compressed with one thread or several. This basically precludes multi-threaded compression at package build time, as the compressed file checksums would not match if the package were rebuilt with a different number of compression threads. (The unpacked payload will always be the same, but the compressed xz file *will* be binary different.)
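A quick way to check that reproducibility claim, as a sketch (the file name and compression level are illustrative):

    # compress the same input with one and with four worker threads
    zstd -q -19 -T1 payload.tar -o payload.t1.zst
    zstd -q -19 -T4 payload.tar -o payload.t4.zst

    # if the claim holds, the two outputs are byte-identical
    sha256sum payload.t1.zst payload.t4.zst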
ncmncm over 5 years ago
Zstd has an enormous advantage in compression and, especially, decompression speed. It often doesn't compress *quite* as much, but we don't care about that as much as we once did, and we rebuild packages more than we once did.

This looks like a very good move. Debian should follow suit.
kbumsik over 5 years ago
> Recompressing all packages to zstd with our options yields a total ~0.8% increase in package size on all of our packages combined, but the decompression time for all packages saw a ~1300% speedup.

Impressive. As an AUR package maintainer, I am also wondering how the compression speed compares, though.
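For a rough feel for the compression side on a single package payload, a sketch (the file name and compression levels are illustrative, not necessarily what Arch's packaging uses):

    # compare wall-clock compression time and resulting size
    time xz   -6  -T1 -k -f pkg.tar
    time zstd -19 -T0 -k -f pkg.tar
    ls -l pkg.tar.xz pkg.tar.zst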
JeremyNT over 5 years ago
I learned about this one the hard way when I went to update a really crufty (~1 year since last update) Arch system I use infrequently the other day. I had failed to update my libarchive version prior to the change, and the package manager could not process the new format.

Luckily, updating libarchive manually to an intermediate version resolved my issue and everything proceeded fine.

This is a good change, but it's a reminder to pay attention to the Arch Linux news feed, because every now and then something important will change. The maintainers provided ample warning about this change there (and indeed I had updated my other systems in response), so we procrastinators really had no excuse :)
golergka over 5 years ago
I used zstd for on-the-fly compression of game data for p2p multiplayer synchronization, and got 2-5x as much data (depending on the payload type) into each TCP packet. It's sad that it still doesn't get much adoption in the industry.
loeg over 5 years ago
I'd love to see Zstandard accepted in other places where the current option is only the venerable zlib, e.g. git packing and ssh -C. It has more breadth and is better (ratio/CPU) than zlib at every point in the curve where zlib even participates.
rwmj over 5 years ago
I wish zstd supported seeking and partial decompression (https://github.com/facebook/zstd/issues/395#issuecomment-535875379). We could then use it for hosting disk images, as it would be a lot faster than xz, which we currently use.
gravitas over 5 years ago
AUR users -- the default settings in /etc/makepkg.conf (delivered by the pacman package as of 5.2.1-1) are still at xz; you must manually edit your local config:

    PKGEXT='.pkg.tar.zst'

The largest package I always wait on, perfect for this scenario, is `google-cloud-sdk` (the re-compression is a killer -- `zoom` is another one in AUR that's a beast), so I used it as a test on my laptop here in "real world conditions" (browsers running, music playing, etc.). It's an old Dell m4600 (i7-2760QM, rotating disk), nothing special. What matters is that with default xz, compression takes twice as long and *appears* to drive the CPU harder. Using xz my fans always kick in for a bit (normal behaviour); testing zst here did not kick the fans on the same way.

After warming up all my caches with a few pre-builds to try and keep it fair by reducing disk I/O, here's a sampling of the results:

    xz defaults  - Size: 33649964
      real  2m23.016s
      user  1m49.340s
      sys   0m35.132s
    ----
    zst defaults - Size: 47521947
      real  1m5.904s
      user  0m30.971s
      sys   0m34.021s
    ----
    zst mpthread - Size: 47521114
      real  1m3.943s
      user  0m30.905s
      sys   0m33.355s

I can re-run them and get a pretty consistent return (so that's good, we're "fair" to a degree); there's disk activity building this package (seds, etc.), so it's not pure compression only. It's a scenario I live every time this AUR package (google-cloud-sdk) is refreshed and we get to upgrade. Trying to stick with real world, not synthetic benchmarks. :)

I did not seem to notice any appreciable difference from adding `--threads=0` to `COMPRESSZST=` (from the Arch wiki); both consistently gave me right around what you see above. This was compression-only testing, which is where my wait time is when upgrading these packages; a huge improvement with zst seen here...
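Putting the edits mentioned above together, a sketch of the relevant lines in /etc/makepkg.conf (PKGEXT switches the package format; the --threads=0 addition to COMPRESSZST is the Arch wiki's multi-core suggestion; pacman >= 5.2 is assumed):

    PKGEXT='.pkg.tar.zst'
    COMPRESSZST=(zstd -c -z -q --threads=0 -)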
maxpert over 5 years ago
I've used LZ4 and Snappy in production for compressing cache/mq payloads, on a service serving billions of clicks a day. So far I'm very happy with the results. I know zstd requires more CPU than LZ4 or Snappy on average, but has anyone used it under heavy traffic loads on web services? I am really interested in trying it out, but at the same time I'm held back by "don't fix it if it ain't broken".
G4E over 5 years ago
For those who want a TL;DR: the trade-off is a 0.8% increase in package size for a 1300% increase in decompression speed. Those numbers come from a sample of 542 packages.
Phlogi over 5 years ago
The wiki is already up to date if you build your own or AUR packages and want to use multiple CPU cores: https://wiki.archlinux.org/index.php/Makepkg#Utilizing_multiple_cores_on_compression
yjftsjthsd-h over 5 years ago
> If you nevertheless haven't updated libarchive since 2018, all hope is not lost! Binary builds of pacman-static are available from Eli Schwartz' personal repository, signed with their Trusted User keys, with which you can perform the update.

I am a little shocked that they bothered; Arch is rolling release and explicitly does not support partial upgrades (https://wiki.archlinux.org/index.php/System_maintenance#Partial_upgrades_are_unsupported). So hitting this means you didn't update a rather important library for over a year, which officially implies that you didn't update *at all* for over a year, which... is unlikely to be sensible.
shmerl over 5 years ago
Was XZ used in parallelized fashion? Otherwise comparing is kind of pointless. Single threaded XZ decompression is way too slow.
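For context, xz itself can compress with multiple threads, though decompression of the result was still effectively single-threaded in the xz releases of that era; a sketch (the file name is illustrative):

    # use all available cores for compression
    xz -T0 -k big.tar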
zerogara over 5 years ago
Most of the published results show very little difference, positive or negative, in decompression speed - where is this ~1300% coming from?

edit: Sorry, my fault - I was thinking of decompression RAM, not speed, although I was influenced by my own test, in which (without measuring) both xz and zstd seemed instant.
dhsysusbsjsi over 5 years ago
Quick shout-out to LZFSE. Similar compression ratio to zlib but much faster.

https://github.com/lzfse/lzfse
nwah1 over 5 years ago
I wonder if they will switch to using zstd for mkinitcpio.
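If and when mkinitcpio supports it (later versions do), the switch would look something like this sketch in /etc/mkinitcpio.conf, followed by regenerating the images:

    COMPRESSION="zstd"

    # rebuild all configured initramfs images
    mkinitcpio -P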
imtringued over 5 years ago
This blog post probably wasted more of my time than I will ever gain from the faster decompression...
vmchale over 5 years ago
What of lzip?
Annatar over 5 years ago
I couldn't care less about decompression speed, because the bottleneck is the network, which means I want my packages as small as possible. Smaller packages mean faster installation; at xz's decompression rate of 54 MB/s or better, I couldn't care less about a few milliseconds saved during decompression. For me, this decision is dumbass stupid.