As part of Spark 2.0, we are introducing some neat new optimizations to make a general engine as efficient as specialized code.<p>I just tried this on the Spark master branch (i.e. the work-in-progress code for Spark 2.0). It takes about 1.5 secs to sum up 1 billion 64-bit integers using a single thread, and about 1 sec using 2 threads. This was done on my laptop (Early 2015 MacBook Pro 13, 3.1GHz Intel Core i7).<p>We haven't optimized integer sorting yet, so that's probably not going to be super fast, but the aggregation performance has been pretty good.<p><pre><code> scala> val start = System.nanoTime
start: Long = 56832659265590
scala> sqlContext.range(0, 1000L * 1000 * 1000, 1, 2).count()
res8: Long = 1000000000
scala> val end = System.nanoTime
end: Long = 56833605100948
scala> (end - start) / 1000 / 1000
res9: Long = 945
</code></pre>
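<p>The snippet above times a count over the range; a sum over the same 1 billion integers can be expressed with the same API. A minimal sketch, using the sqlContext.range call shown above (the generated column is named "id"):<p><pre><code> scala> // aggregate the range's built-in "id" column into a single sum
 scala> sqlContext.range(0, 1000L * 1000 * 1000, 1, 2).selectExpr("sum(id)").show()
</code></pre>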
Part of the time is actually spent analyzing the query plan, optimizing it, and generating bytecode for it. If we run this on 10 billion integers, the time is about 5 secs.
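<p>If you want to see what the planner is doing for a query like this, the standard DataFrame explain API prints the plans (a small sketch; the exact plan output will vary by build):<p><pre><code> scala> val df = sqlContext.range(0, 1000L * 1000 * 1000, 1, 2)
 scala> df.explain(true)  // prints parsed, analyzed, optimized, and physical plans
</code></pre>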