If you're handling a lot of data, it makes sense to hash-partition it on some key and spread it across a large number of files.<p>In that case you might have, say, 512 partitions, and since each partition is an independent file you can farm out compression, decompression and other tasks to as many CPUs as you want, even to other machines in a cluster.
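<p>A minimal sketch of that idea in Python, under some assumptions of mine: records are (key, value) string pairs that fit in memory, the partition is picked with CRC32 of the key, and gzip stands in for whatever compressor you actually use. The names `partition_of`, `write_partitions`, `compress_file` and `compress_all` are hypothetical, made up for illustration.

```python
import gzip
import os
import shutil
import zlib
from concurrent.futures import ThreadPoolExecutor

NUM_PARTITIONS = 512  # the example count from above; any positive count works


def partition_of(key, n=NUM_PARTITIONS):
    """Map a key to a partition number in [0, n)."""
    # CRC32 gives the same answer in every process and on every machine,
    # unlike Python's built-in hash(), which is randomized per process for str.
    return zlib.crc32(key.encode("utf-8")) % n


def write_partitions(records, out_dir, n=NUM_PARTITIONS):
    """Group (key, value) records by partition and write one file per partition."""
    # Assumption: the data fits in memory. A streaming version would append
    # to the partition files as records arrive instead.
    buckets = [[] for _ in range(n)]
    for key, value in records:
        buckets[partition_of(key, n)].append(f"{key}\t{value}\n")

    paths = []
    for i, lines in enumerate(buckets):
        path = os.path.join(out_dir, f"part-{i:04d}.tsv")
        with open(path, "w", encoding="utf-8") as f:
            f.writelines(lines)
        paths.append(path)
    return paths


def compress_file(path):
    """Gzip a single partition file and remove the uncompressed original."""
    gz_path = path + ".gz"
    with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    os.remove(path)
    return gz_path


def compress_all(paths, workers=None):
    """Compress every partition independently using a pool of workers."""
    # Each file is a separate task with no shared state, so the same shape
    # carries over to a process pool or a job queue spanning several machines.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(compress_file, paths))
```

<p>A thread pool keeps the sketch short. Because no partition depends on any other, swapping in a process pool or handing the file list to workers on other machines should not change the overall structure.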