
Lossless decompression and the generation of random samples

44 points by joelg236, almost 12 years ago

3 comments

DanBC, almost 12 years ago
> *Huffman coding tries to compress text one letter at a time on the assumption that each letter comes from some fixed and known probability distribution. If the algorithm is successful then we'd expect the compressed text to look like a uniformly distributed sequence of bits. If it didn't then there'd be patterns that could be used for further compression.*

This can be gently confusing when you're using different compression systems (bits vs bytes).

(https://groups.google.com/d/topic/lz4c/DcN5SgFywwk/discussion)

Someone is compressing very large log files. They then compressed the output, and got further reductions in size.

> *The fundamental reason is that these highly repetitive byte sequences, with very small and regular differences, produce repetitive compressed sequences, which can therefore be compressed further.* - Yann Collet
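The double-compression effect is easy to reproduce. A minimal sketch using Python's stdlib `zlib` (the linked thread is about LZ4, which isn't in the stdlib, and the log lines here are made up for illustration):

```python
import zlib

# Hypothetical highly repetitive "log": lines that differ only by a small,
# regular field, like the log files described in the thread.
lines = [f"2013-08-07 12:00:{i % 60:02d} GET /index.html 200\n" for i in range(10000)]
data = "".join(lines).encode()

# Compress once, then compress the compressed output again.
once = zlib.compress(data, 9)
twice = zlib.compress(once, 9)

# For very regular input, the first pass can emit repetitive token sequences,
# so the second pass may shrink the stream further, as Collet describes.
# For typical input, entropy-coded output looks uniform and won't recompress.
print(len(data), len(once), len(twice))

# Either way the pipeline must be lossless: decompressing twice recovers the input.
assert zlib.decompress(zlib.decompress(twice)) == data
```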
hadronzoo, almost 12 years ago
The same is true of arithmetic coding, which separates the probability model from the encoding process. Feed an arithmetic coder a stream of random bits and it will efficiently sample from your model. See section 6.3: http://www.inference.phy.cam.ac.uk/mackay/itprnn/ps/105.124.pdf
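The same trick has a simple prefix-code analogue, which is the idea behind the article's title: decoding a stream of fair random bits through a code built for some distribution yields samples from that distribution. A sketch with a made-up three-symbol model (not MacKay's example):

```python
import random
from collections import Counter

# Toy model: P(a)=0.5, P(b)=0.25, P(c)=0.25, with a matching Huffman code
# a -> 0, b -> 10, c -> 11. Each codeword w has probability 2^-len(w),
# which equals the model probability, so decoding fair bits samples the model.
decode = {"0": "a", "10": "b", "11": "c"}

rng = random.Random(42)  # seeded for reproducibility

def sample(n):
    """Decode a stream of fair random bits through the prefix code."""
    out, buf = [], ""
    while len(out) < n:
        buf += str(rng.getrandbits(1))
        if buf in decode:          # complete codeword reached
            out.append(decode[buf])
            buf = ""
    return out

xs = sample(100_000)
freq = Counter(xs)
print({k: freq[k] / len(xs) for k in "abc"})  # close to 0.5, 0.25, 0.25
```

This only works exactly when the model probabilities are powers of two; arithmetic coding, as the comment notes, handles arbitrary distributions because it decouples the model from the bit-level encoding.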
Chris2048, almost 12 years ago
I simply don't understand this, I don't think I have the right background. Any good starting places for finding more about this topic?