
Parallel decompression of gzip-compressed files

3 points by nkrumm almost 6 years ago

2 comments

nkrumm almost 6 years ago
GitHub: https://github.com/Piezoid/pugz

From the readme:

"Contrary to the pigz program which does single-threaded decompression (see https://github.com/madler/pigz/blob/master/pigz.c#L232), pugz found a way to do truly parallel decompression. In a nutshell: the compressed file is split into consecutive sections, processed one after the other. Sections are in turn split into chunks (one chunk per thread) and will be decompressed in parallel. A first pass decompresses chunks and keeps track of back-references (see e.g. our paper for the definition of that term), but is unable to resolve them. Then, a quick sequential pass is done to resolve the contexts of all chunks. A final parallel pass translates all unresolved back-references and outputs the file."
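The two-pass idea described above can be illustrated with a toy sketch. This is not the pugz code (pugz is C++ and operates on real DEFLATE streams); the token format and the `decode_chunk`/`resolve` helpers are invented for illustration, assuming a simplified LZ-style stream of literals and back-references:

```python
from concurrent.futures import ThreadPoolExecutor

# Toy model of the pugz two-pass scheme (not the real implementation):
# chunks are decoded in parallel, and any back-reference that points
# before a chunk's start is recorded as a "hole"; a later sequential
# pass, which has the full left context, fills the holes in.

def decode_chunk(tokens):
    """First pass: decode one chunk in isolation.

    Tokens are ('lit', byte) or ('ref', distance, length).
    Output items are ints (resolved bytes) or ('hole', distance).
    """
    out = []
    for tok in tokens:
        if tok[0] == 'lit':
            out.append(tok[1])
        else:
            _, dist, length = tok
            for _ in range(length):
                pos = len(out) - dist
                if pos >= 0 and isinstance(out[pos], int):
                    out.append(out[pos])        # resolvable inside the chunk
                else:
                    out.append(('hole', dist))  # context is in an earlier chunk
    return out

def resolve(chunks):
    """Sequential pass: concatenate chunks, filling holes from full context."""
    result = []
    for chunk in chunks:
        for item in chunk:
            if isinstance(item, int):
                result.append(item)
            else:
                result.append(result[len(result) - item[1]])
    return bytes(result)

tokens = [('lit', ord('a')), ('lit', ord('b')), ('ref', 2, 4)]
chunks = [tokens[:2], tokens[2:]]               # one chunk per thread
with ThreadPoolExecutor() as ex:
    decoded = list(ex.map(decode_chunk, chunks))
print(resolve(decoded))  # b'ababab'
```

The second chunk here consists entirely of a back-reference into the first chunk, so the parallel pass can only emit holes for it; the cheap sequential pass then resolves everything, which is the essence of why pugz's decompression stays mostly parallel.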
LinuxBender almost 6 years ago
Somewhat related: for bzip2, I use pbzip2, which uses all the cores, or as many as you specify. [1] It is in the EPEL repo for RHEL/CentOS/Fedora.

[1] - https://linux.die.net/man/1/pbzip2
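Worth noting why this works for bzip2 in particular: a pbzip2 archive is effectively a concatenation of independently compressed bz2 streams, so each piece can be decompressed on its own, in any order. A minimal sketch of that property using only Python's stdlib `bz2` module (the "archive" layout here stands in for what pbzip2 roughly produces):

```python
import bz2
from concurrent.futures import ThreadPoolExecutor

# bzip2 compresses data in independent blocks; pbzip2 exploits this by
# writing one stream per worker and concatenating them.
parts = [b'hello ' * 1000, b'world ' * 1000]
streams = [bz2.compress(p) for p in parts]   # one independent stream per part
archive = b''.join(streams)                  # rough stand-in for pbzip2 output

# A sequential reader handles the concatenation as one file...
assert bz2.decompress(archive) == b''.join(parts)

# ...but each stream can also be decompressed in parallel, independently.
with ThreadPoolExecutor() as ex:
    decoded = b''.join(ex.map(bz2.decompress, streams))
print(decoded == b''.join(parts))  # True
```

This is exactly what gzip/DEFLATE lacks: its back-references cross chunk boundaries, which is why pugz needs the extra resolution pass described in the comment above.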