Many times faster (de)compression using multiple processors.

37 points by igorhvr about 13 years ago

7 comments

alecco about 13 years ago
It's good that people are getting interested in the subject. But this is very odd and has some errors. For example, xz requires a lot more memory than bzip2 (see the benchmarks below, Mem column).

http://mattmahoney.net/dc/text.html
http://mattmahoney.net/dc/uiq/

Matt Mahoney maintains the best benchmarks on text and generic compression. Some of the best people in the field (like Matt) usually hang out at encode.ru.
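
Claims like this are easy to check locally with GNU time. A minimal sketch, assuming GNU time is installed at /usr/bin/time and using a placeholder input file; peak memory shows up in the "Maximum resident set size" line:

    # compare wall time and peak memory of bzip2 vs xz at their highest presets
    /usr/bin/time -v bzip2 -9 -k corpus.txt   # -k keeps the original file
    /usr/bin/time -v xz -9 -k corpus.txt      # xz -9 uses a 64 MiB dictionary, so expect a much larger peak
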
bmm6o about 13 years ago
Is this decompressing a single stream on multiple processors? My knowledge of gzip is very limited, but I would have thought sequential processing was required. What's the trick here? (TFA doesn't explain anything, and e.g. the pigz homepage doesn't either.)
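
For compression, the usual trick is that concatenated gzip members are themselves a valid gzip stream, so independent chunks can be compressed on separate cores and joined afterwards. A minimal sketch of the idea (file names and chunk size are placeholders; pigz does this internally with much smaller blocks):

    # split the input, compress the chunks on all cores, then concatenate;
    # gunzip decompresses the multi-member result as a single file
    split -b 64M -d bigfile chunk-
    ls chunk-* | xargs -P "$(nproc)" -n 1 gzip
    cat chunk-*.gz > bigfile.gz

Decompressing a single stream is a different matter: each deflate block can reference the previous 32 KB of output, so pigz's documentation says decompression itself is not parallelized; its extra threads only handle reading, writing, and checksum calculation.
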
joshbaptiste about 13 years ago
Had to try this on my quad-core laptop, as I'd never heard of these tools.

    josh@snoopy:~/Downloads $ grep -m2 -i intel /proc/cpuinfo
    vendor_id : GenuineIntel
    model name : Intel(R) Core(TM) i5-2410M CPU @ 2.30GHz
    josh@snoopy:~/Downloads $ ls -l test
    -rw-r--r-- 1 josh josh 1073741824 2012-03-07 20:06 test
    josh@snoopy:~/Downloads $ time gzip test

    real    0m16.430s
    user    0m10.210s
    sys     0m0.490s

    josh@snoopy:~/Downloads $ time pigz test

    real    0m5.028s
    user    0m16.040s
    sys     0m0.620s

Looks good, although the man page describes it as "an almost compatible replacement for the gzip program".
isocpprar about 13 years ago
Is xz less resource-intensive than bzip2? My testing (admittedly two years ago or so) showed significant differences: a better compression ratio with xz, but significantly longer run times and/or more memory used.
PaulHoule about 13 years ago
If you're handling a lot of data, it makes sense to hash-partition it on some key and spread it out across a large number of files.

In that case you might have, say, 512 partitions, and you can farm out compression, decompression, and other tasks to as many CPUs as you want, even other machines in a cluster.
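
A minimal sketch of that layout, assuming tab-separated records keyed on the first field; the partition count, hash, and file names are illustrative, and gawk is assumed because it transparently manages more open output files than the OS limit:

    # hash-partition records into 512 files by key, then compress the partitions in parallel
    gawk -F'\t' 'BEGIN { for (i = 1; i < 256; i++) ord[sprintf("%c", i)] = i }
    {
        h = 0
        for (i = 1; i <= length($1); i++)
            h = (h * 31 + ord[substr($1, i, 1)]) % 512
        print > sprintf("part-%03d", h)
    }' records.tsv
    ls part-* | xargs -P "$(nproc)" -n 1 gzip

Each part-NNN file then compresses (and later decompresses) independently, which is what makes the work trivial to spread across cores or machines.
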
mappu about 13 years ago
I like to use PPMd (via 7zip) for large volumes of text, but it seems to be single-threaded only, which is a shame. It cuts a good 30% again off the size of the .xml.bz2 files that Wikipedia provides.
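
For reference, selecting PPMd in 7-Zip looks roughly like this (archive and input names are placeholders, and the mem/o model parameters are optional tuning knobs, not required):

    # build a 7z archive with the PPMd method instead of the default LZMA
    7z a -m0=PPMd:mem=256m:o=16 enwiki.7z enwiki-latest-pages-articles.xml
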
dhruvbird about 13 years ago
This is awesome, since parallel compression has been largely neglected in practice.