
Lossless decompression and the generation of random samples

44 points by joelg236 · almost 12 years ago

3 comments

DanBC · almost 12 years ago
> *Huffman coding tries to compress text one letter at a time on the assumption that each letter comes from some fixed and known probability distribution. If the algorithm is successful then we'd expect the compressed text to look like a uniformly distributed sequence of bits. If it didn't then there'd be patterns that could be used for further compression.*

This can be gently confusing when you're using different compression systems (bits vs. bytes).

(https://groups.google.com/d/topic/lz4c/DcN5SgFywwk/discussion)

Someone there was compressing very large log files; they then compressed the output and got further reductions in size.

> *The fundamental reason is that these highly repetitive byte sequences, with very small and regular differences, produce repetitive compressed sequences, which can therefore be compressed further.* - Yann Collet
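A quick way to see the "compressed output looks uniform" claim is to measure the byte-value entropy of some repetitive, log-like text before and after compression. The sketch below is my own construction using Python's zlib and made-up log lines (not the files from the linked thread): a well-compressed stream sits near 8 bits/byte and gains almost nothing from a second pass, whereas the LZ4 case in the link is a fast compressor leaving enough structure behind to compress again.

```python
import math
import zlib
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte-value distribution, in bits per byte (max 8)."""
    counts = Counter(data)
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

# Synthetic log-like text: highly repetitive lines with small, regular differences.
lines = [f"2013-08-09 12:{(i // 60) % 60:02d}:{i % 60:02d} GET /item?id={i} HTTP/1.1 200\n"
         for i in range(5000)]
raw = "".join(lines).encode()

packed = zlib.compress(raw, 9)
print(f"raw:        {len(raw):7d} bytes, {byte_entropy(raw):.2f} bits/byte")
print(f"compressed: {len(packed):7d} bytes, {byte_entropy(packed):.2f} bits/byte")

# A second pass over already-dense output typically gains almost nothing,
# which is what the quoted "uniformly distributed bits" claim predicts.
print(f"recompressed: {len(zlib.compress(packed, 9))} bytes")
```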
hadronzoo · almost 12 years ago
The same is true of arithmetic coding, which separates the probability model from the encoding process. Feed an arithmetic coder a stream of random bits and it will efficiently sample from your model. See section 6.3: http://www.inference.phy.cam.ac.uk/mackay/itprnn/ps/105.124.pdf
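To make that concrete: run an arithmetic *decoder* over a stream of fair coin flips and the decoded symbols are samples from the model, at roughly the entropy's worth of bits per draw. Below is a minimal toy sketch of the idea (my own construction, not the code from MacKay's book); it skips interval renormalization, so floating-point precision limits it to a few dozen draws, but it shows the mechanism.

```python
import random

def arithmetic_sampler(probs, n_draws):
    """Sample from the categorical distribution `probs` by feeding uniform
    random bits into an arithmetic decoder.

    Toy version: no interval renormalization, so float precision limits it to
    a few dozen draws.  Returns (samples, number_of_random_bits_consumed).
    """
    # Cumulative sub-intervals for each symbol on [0, 1).
    cdf, acc = [], 0.0
    for sym, p in probs.items():
        cdf.append((sym, acc, acc + p))
        acc += p

    low, high = 0.0, 1.0    # current decoding interval
    code, scale = 0.0, 0.5  # binary fraction built from the bits read so far
    samples, bits_used = [], 0

    while len(samples) < n_draws:
        for sym, c_lo, c_hi in cdf:
            s_lo = low + (high - low) * c_lo
            s_hi = low + (high - low) * c_hi
            # Unread bits can add at most 2*scale to `code`; if that whole
            # range fits inside one symbol's sub-interval, the symbol is decided.
            if s_lo <= code and code + 2 * scale <= s_hi:
                samples.append(sym)
                low, high = s_lo, s_hi
                break
        else:
            # Not enough information yet: consume one more random bit.
            code += random.getrandbits(1) * scale
            scale /= 2
            bits_used += 1
    return samples, bits_used

probs = {"a": 0.5, "b": 0.25, "c": 0.25}   # entropy = 1.5 bits per symbol
samples, bits_used = arithmetic_sampler(probs, 20)
print("".join(samples), f"({bits_used} bits for 20 draws)")
```

With the example distribution above, the bits consumed per draw come out close to its 1.5-bit entropy, which is the "efficiently sample" part of the comment.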
Chris2048 · almost 12 years ago
I simply don't understand this; I don't think I have the right background. Any good starting places for finding more about this topic?
Comment #6178361 not loaded
Comment #6178370 not loaded
Comment #6178582 not loaded