What is the bottleneck here? I am surprised that the CPU utilisation is just 2%, which suggests that either the disk or the network is the bottleneck, yet neither is discussed. And of course, one shouldn't forget that if the CPU were the bottleneck, we would need "true" concurrency, i.e. parallelism, with a thread running on every single core. :-)
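
For what it's worth, a rough way to check which case applies is to compare the CPU time a task consumes with the wall-clock time it takes. Below is a minimal sketch, assuming Python. The names `cpu_fraction`, `waiting_task` and `busy_task` are made up for illustration, and `time.sleep` merely stands in for a blocking disk or network call; it is not the workload discussed above.

```python
import time


def cpu_fraction(task):
    """Return the share of wall-clock time the task spent on the CPU."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    task()
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return cpu_used / wall_used if wall_used > 0 else 0.0


def waiting_task():
    # Hypothetical stand-in for a blocking disk or network call.
    time.sleep(0.2)


def busy_task():
    # Hypothetical stand-in for CPU-bound work.
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total


if __name__ == "__main__":
    print(f"waiting task: {cpu_fraction(waiting_task):.0%} of wall time on CPU")
    print(f"busy task:    {cpu_fraction(busy_task):.0%} of wall time on CPU")
```

A fraction close to zero means the task mostly waits on something other than the CPU, so adding threads per core would not help it; a fraction close to one is the case where parallelism across cores matters.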