Having transferred petabytes of data in tens of millions of files over the past months, let me assure you there's only one tool you really need: GNU parallel.

Whether you copy the individual files with ftp, scp, or rsync is largely irrelevant. The network is always your ultimate bottleneck; using a slower copy tool just means setting a slightly higher concurrency to max it out.
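As a rough sketch of that approach (the source directory, remote user, host, and job count below are all placeholders, and the rsync flags are just one reasonable choice):

    # Push each file in its own rsync, 16 at a time; raise/lower -j until the link saturates.
    # /data, user, and dest.example.com are hypothetical.
    cd /data && find . -type f -print0 | \
        parallel -0 -j 16 rsync -aR {} user@dest.example.com:/data/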
The primary advantage GridFTP has over simply using tar+netcat for performance is that GridFTP can multiplex transfers over multiple TCP connections. This is helpful as long as the endpoint systems limit the per-connection buffer size to some value less than the bandwidth-delay product (BDP) between them. If you've got to bug sysadmins to get GridFTP set up for you on both endpoints, you might as well just ask them to increase the maximum TCP buffer size to match the BDP.

EDIT: Sorry, "multiplex" is not the right word to describe that. It's more like GridFTP "stripes" files across multiple connections: it divides the file into chunks, sends the chunks over parallel connections, and reassembles the file at the destination.
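For the buffer-size route, here is roughly what that request looks like on Linux, assuming a hypothetical 10 Gbit/s path with 50 ms RTT (so BDP ≈ 10 Gbit/s × 0.05 s ≈ 62.5 MB); exact values depend on your link:

    # Allow TCP windows up to 64 MB (values in bytes); needs root on both endpoints.
    sysctl -w net.core.rmem_max=67108864
    sysctl -w net.core.wmem_max=67108864
    sysctl -w net.ipv4.tcp_rmem="4096 87380 67108864"
    sysctl -w net.ipv4.tcp_wmem="4096 65536 67108864"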
I like the tar+netcat mentioned towards the bottom for LAN transfers. That usually goes much faster than rsync or scp.

The reason I haven't looked at other tools is that I do this intermittently and always reach for whatever is already installed on the system.
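For reference, the usual tar-over-netcat pattern looks something like this (host, port, and paths are placeholders, flag spellings vary between netcat variants, and there's no encryption, so treat it as a sketch for a trusted LAN):

    # On the receiving host: listen on TCP 7000 and unpack into /dest
    # (some netcat builds want "nc -l 7000" without -p).
    nc -l -p 7000 | tar -x -C /dest

    # On the sending host: stream /src as a tar archive to the receiver.
    tar -c -C /src . | nc receiver.example.com 7000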
If you have to regularly transfer large amounts of data over a network, it might be worth looking into a WAN optimization product like Riverbed's Steelhead, Silverpeak's VX/NX lines, Bluecoat Mach 5, or one of the other vendors' solutions.

Yeah, you could try to roll it yourself, since it really just comes down to compressing and deduplicating what you send over the wire, but doing that well while also keeping it simple to use is not a trivial problem. Why reinvent the wheel badly?
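The compression half of the homegrown version is easy enough, which is sort of the point: this hypothetical pipe just drops zstd into the tar+netcat pattern above (same placeholder host, port, and paths), and it's the deduplication and transparency that the appliances actually add.

    # Receiver: listen, decompress, unpack.
    nc -l -p 7000 | zstd -d -c | tar -x -C /dest

    # Sender: tar, compress with all cores, ship it.
    tar -c -C /src . | zstd -T0 -c | nc receiver.example.com 7000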
This is a good site to visit if you have these kinds of data transfer issues: http://fasterdata.es.net