
Precomp: Further compress files that are already compressed

59 points by marcopolis over 4 years ago

8 comments

mappu over 4 years ago
Precomp works by brute-forcing the zlib/.../ parameters that reproduce the input byte-identically, then decompressing, and (optionally) recompressing with a stronger compressor.

It was closed-source for a long time and AntiX was an open-source version.

One interesting use case is archiving Android ROMs for projects such as LineageOS. Each ROM is a zip file; they differ almost entirely at the byte level, and storing a long history takes a lot of space. But under Precomp, the differences between two nightly builds are quite small, and they can be packed together into a solid archive for a significant (90%+) space saving.
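The brute-force search described above can be sketched with Python's standard zlib module. This is illustrative only, not Precomp's actual mechanism (its preflate integration reconstructs far more deflate state than level and memLevel):

```python
import zlib

def find_zlib_params(blob: bytes):
    """Search zlib compression parameters until recompressing the
    decompressed payload reproduces the original bytes exactly.
    Returns (level, memLevel) on success, or None if the stream was
    produced by a deflate implementation this search can't match."""
    raw = zlib.decompress(blob)
    for level in range(10):              # 0 (stored) .. 9 (best)
        for memlevel in range(1, 10):
            co = zlib.compressobj(level, zlib.DEFLATED, 15, memlevel)
            if co.compress(raw) + co.flush() == blob:
                return level, memlevel
    return None

# A stream made by zlib itself should be reproducible.
data = zlib.compress(b"The quick brown fox. " * 200, 6)
print(find_zlib_params(data))
```

Once matching parameters are found, an archiver can store the decompressed payload plus the parameters, and later rebuild the original compressed stream bit for bit, which is what makes the nightly-ROM deduplication above work.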
schnaader over 4 years ago
Author of Precomp here. First of all, thanks for the attention; the sudden rise of the GitHub stats surprised me, but now I know the reason. So, some comments from my side. I'll answer some of the threads, feel free to ask questions.

The project has been around for quite a long time. I started it in 2006, but as I'm basically working on it alone (though it got better with the change to open source) and don't have much spare time (studying, work, father of two kids), updates are less frequent than I'd wish.

The upside of this long time is that the program itself is quite stable, so e.g. it's hard to find data that leads to a crash or incorrect decompression that is not specially crafted for this purpose.

The biggest challenge at the moment is the codebase, which is very monolithic (one big precomp.cpp) and mixes newer parts (e.g. the preflate integration) with old C-style code. On the plus side, the code is platform independent (in the branch, it even compiles on ARM) and compiling should be no problem using CMake.

Another thing missing because of the lack of time is documentation. There's some basic information in the README and the program syntax reveals the meaning of most parameters, but there could be much more. A lot of information can be found on the encode.su forum, but of course it is very unstructured and often related to bugs, questions about the program/algorithm, or problems on certain files.

That said, just throw your data at Precomp and see how it performs. Both ratio and duration depend heavily on what data is fed in, but since some of the supported streams like zlib/deflate or JPG are used everywhere, there are many (sometimes surprising) examples like APK packages and Linux distribution images where it outperforms the usual compressors like 7-Zip.

And last but not least, the usual GitHub things apply: feel free to check out the existing issues, create new ones, play with the source code, fork it, create pull requests.
cl3misch over 4 years ago
Shouldn't the name be "postcomp" if it's for files which are already compressed?
pimlottc over 4 years ago
Haven't tried this, but another related tool that works great for re-compressing existing files is AdvanceCOMP [0]. It can losslessly optimize zip, png and mng files in place, which is particularly handy for zip files.

I used this on a project to drastically reduce the size of generated PowerPoint decks, in order to stay below the email attachment limits for our distro list recipients. Very handy!

[0] http://www.advancemame.it/comp-readme
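The zip case can be approximated with Python's standard zipfile module: rewrite every entry at the maximum deflate level, leaving the extracted contents byte-identical. This is only a sketch of the idea; AdvanceCOMP's real optimizer uses a stronger zopfli-style deflate, and the entry name `slides.xml` below is just a made-up example:

```python
import io
import zipfile

def optimize_zip_bytes(data: bytes) -> bytes:
    """Rewrite every entry of a zip archive at deflate level 9.
    Extracted contents stay byte-identical; only the container shrinks."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, \
         zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED,
                         compresslevel=9) as dst:
        for info in src.infolist():
            dst.writestr(info.filename, src.read(info))
    return out.getvalue()

# Demo: build a loosely compressed archive, then re-pack it.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED, compresslevel=1) as z:
    z.writestr("slides.xml", "spam " * 5000)
small = optimize_zip_bytes(buf.getvalue())
print(len(buf.getvalue()), len(small))
```

Since Office documents (pptx, docx, xlsx) are just zip containers, this is why the trick works so well on generated PowerPoint decks.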
SNosTrAnDbLe over 4 years ago
I could not find any stats or compression ratios in there. Also, the backward compatibility disclaimer makes me worried: if I use v1 to precompress a file, then I will need to keep the v1 binary along with the precompressed file, or I risk losing the data.
olliej over 4 years ago
So this is essentially an optimizer (e.g. no bogus recursive compression claims). It decompresses specific regions of certain file formats so they can be recompressed with better algorithms.

A hypothetical (simpler, less powerful) equivalent would be a program that reads gzip files compressed at the lowest level and recompresses them with gzip set to the highest level.
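That hypothetical equivalent fits in a few lines of Python's standard gzip module (a sketch of the comment's thought experiment, not anything Precomp actually does; note it loses the original gzip header fields such as mtime, which is exactly the byte-identity problem Precomp's parameter recording solves):

```python
import gzip

def recompress_gzip(blob: bytes) -> bytes:
    """Decompress a gzip member and re-emit its payload at level 9."""
    raw = gzip.decompress(blob)
    return gzip.compress(raw, compresslevel=9, mtime=0)

# A file compressed quickly at level 1, then re-packed at level 9.
fast = gzip.compress(b"sample text " * 2000, compresslevel=1)
best = recompress_gzip(fast)
print(len(fast), len(best))
```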
pabs3 over 4 years ago
This reminds me of Debian's pristine-tar:

https://kitenet.net/~joey/code/pristine-tar/
https://joeyh.name/blog/entry/generating_pristine_tarballs_from_git_repositories/
mchusma over 4 years ago
Is the novelty that it only works on certain filetypes, so it is optimized for those?