
Lz_xor

156 points by wooosh almost 3 years ago

7 comments

terrelln almost 3 years ago
This is very interesting!

I've tried something somewhat similar in the past. I was looking at implementing an extremely fast decompressor, with a ratio similar to LZ4. I was able to get 2x the decompression speed of LZ4, but struggled with compression ratio. The idea was to have 16-byte matches, and allow each match to apply a 16-bit mask telling whether each byte is part of the match or a literal. Then I restricted the compressor to only be able to use 16 distinct masks.

This was extremely fast to decompress, because each 16-byte match is: load the 16-byte match into an AVX2 register, load 16 bytes of literals, load the mask you're using, shuffle the literals, then blend the literals and the match. And because the matches are fixed size, you can start the fetch for multiple matches in parallel.

However, the problem I ran into, and would love to solve, is that I also wanted fast-ish compression speed, and it is very hard to search for good matches quickly, since you have holes in the match.

I guess the author is looking at GPU compression, so they are taking a somewhat brute-force approach. But I'd be interested to see how they're doing the match finding, and what kind of speed they're getting.
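To make the decode step concrete, here is a minimal C sketch of the shuffle-and-blend inner loop terrelln describes. This is not his actual code; the table layout and names are my assumptions, and it uses 128-bit SSE intrinsics since a match is exactly 16 bytes:

    #include <immintrin.h>  /* SSSE3 _mm_shuffle_epi8, SSE4.1 _mm_blendv_epi8 */
    #include <stdint.h>

    /* One precomputed entry per distinct mask (the format allows only 16). */
    typedef struct {
        __m128i shuffle; /* scatters packed literal bytes into their hole slots */
        __m128i blend;   /* 0xFF in literal positions, 0x00 in match positions  */
        int     nlit;    /* how many literal bytes this mask consumes           */
    } mask_entry;

    /* Decode one fixed-size 16-byte match; returns the advanced literal ptr. */
    static const uint8_t *decode16(uint8_t *dst, const uint8_t *match,
                                   const uint8_t *lit, const mask_entry *m)
    {
        __m128i mv = _mm_loadu_si128((const __m128i *)match);
        __m128i lv = _mm_loadu_si128((const __m128i *)lit);
        /* Move each packed literal byte to the position of its hole... */
        __m128i scattered = _mm_shuffle_epi8(lv, m->shuffle);
        /* ...then blend: literal bytes where the mask is set, match elsewhere. */
        _mm_storeu_si128((__m128i *)dst,
                         _mm_blendv_epi8(mv, scattered, m->blend));
        return lit + m->nlit;
    }

Because every match is exactly 16 bytes and the masks come from a fixed table, successive decode16 calls have no serial dependence on variable-length decoding, which is what allows the loads for several matches to be in flight at once.
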
hyperpape almost 3 years ago
The initial link in this piece, Compression is Compilation, was really helpful to read before reading this piece: http://richg42.blogspot.com/2015/10/compression-is-compilation.html. I think I got more out of it than this piece.
p4bl0 almost 3 years ago
The idea that a compressor is a compiler made me think back to Chaitin's work on incompleteness and the limits of mathematics, which I wrote a review of a few years ago: https://p4bl0.net/shebang/the-limits-of-mathematics.html
sedatk almost 3 years ago
I don't understand why the author encodes every literal byte as a separate instruction when in reality they're just consecutive bytes. The whole:

    LIT 13
    LIT 24
    LIT 65
    LIT 32
    ...

could have been written as a single instruction:

    LIT [13, 24, 65, 32, ...]

It's almost as if the author tries too hard to support their point that their variant looks better.

"Notice how much faster it makes progress through the file vs. LZSS"

Yeah, because you encode every literal separately? It's all LIT instructions?
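For contrast, a run-oriented literal opcode is trivial to decode. Here is a minimal sketch of what the batched form sedatk suggests could look like on the decode side; the one-byte length prefix is my assumption, not the article's format:

    #include <stdint.h>
    #include <string.h>

    /* Decode "LIT n" followed by n literal bytes; returns bytes consumed. */
    static size_t decode_lit_run(uint8_t *dst, const uint8_t *src)
    {
        size_t n = src[0];        /* run length, assumed to fit in one byte */
        memcpy(dst, src + 1, n);  /* copy the whole literal run at once     */
        return 1 + n;             /* length byte plus n literals            */
    }
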
pcwalton almost 3 years ago
Can a XOR-only decompressor do the standard LZ77 trick of encoding many repeated bytes with a copy operation that refers to not-yet-decompressed data? It seems to me that you'd have to have a lot of 0 patch bytes for that. Huffman coding could encode all the 0 bytes in 1 bit each, but that still loses over regular LZ77. It seems to me that you still might want to keep "COPY" around for RLE-style data.
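For reference, this is the LZ77 trick pcwalton means, as a minimal C sketch: a copy whose offset is smaller than its length reads bytes the decoder has only just written, so a one-byte offset turns into run-length expansion:

    #include <stddef.h>
    #include <stdint.h>

    /* Copy len bytes from offset bytes back in the output buffer.
     * With offset < len the source and destination overlap, and the
     * byte-at-a-time loop deliberately re-reads freshly written bytes:
     * offset 1, len 7 expands a single 'a' into "aaaaaaa". */
    static void lz77_copy(uint8_t *out, size_t pos, size_t offset, size_t len)
    {
        uint8_t       *dst = out + pos;
        const uint8_t *src = dst - offset;  /* may point into [dst, dst+len) */
        for (size_t i = 0; i < len; i++)
            dst[i] = src[i];                /* must not be memcpy/memmove */
    }

Under XOR semantics the same instruction would have to supply a patch byte for every output position, which is where the "lot of 0 patch bytes" comes from.
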
TazeTSchnitzel almost 3 years ago
I wonder how an LZ_ADD-compressed bitmap would compare to a PNG. PNG is basically just DEFLATE on a bitmap, but you first apply some additional encoding to each row to help the compressor, by expressing the row as a kind of difference from the previous row. It would be cool if that became unnecessary.
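For concreteness, one of PNG's per-row predictors is the "Up" filter, which is exactly this kind of row differencing; a minimal sketch:

    #include <stddef.h>
    #include <stdint.h>

    /* PNG "Up" filter: each filtered byte is the raw byte minus the byte
     * directly above it (mod 256). Rows similar to their predecessor
     * become mostly zeros, which DEFLATE then compresses well. */
    static void png_filter_up(uint8_t *out, const uint8_t *row,
                              const uint8_t *prev, size_t rowbytes)
    {
        for (size_t i = 0; i < rowbytes; i++)
            out[i] = (uint8_t)(row[i] - (prev ? prev[i] : 0));
    }

An LZ_ADD-style match that applies per-byte deltas to a copy of the previous row could in principle subsume this filtering step, which is the comparison TazeTSchnitzel is curious about.
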
cbsmith almost 3 years ago
Why am I just finding out about this now?