Google did the terabyte slightly slower (68 seconds) on 4x fewer machines, but did the petabyte in 6 hours and 2 minutes (around 1/3 of the time of Hadoop) on nearly the same number of machines (4000).
Their report (linked to from the post) goes into greater detail: <a href="http://developer.yahoo.com/blogs/hadoop/Yahoo2009.pdf" rel="nofollow">http://developer.yahoo.com/blogs/hadoop/Yahoo2009.pdf</a><p>I'd love to know why the 500 GB and 100 TB sorts ran at about half the speed of the other two (~0.5 TB/min as opposed to ~1 TB/min).