Scala vs. Go TCP Benchmark

139 points by rck, almost 12 years ago

19 comments

dlsspy, almost 12 years ago

Did you consider running the Go client against the Scala server and vice versa?

Also, that's kind of a lot of code. Here's my rewrite of the server: http://play.golang.org/p/hKztKKQf7v

It doesn't return the exact same result, but since you're not verifying the results, it is effectively the same (4 bytes in, 4 bytes back out). I did slightly better with a hand-crafted one.

A little cleanup on the client here: http://play.golang.org/p/vRNMzBFOs5

I'm guessing Scala's hiding some magic, though.

I made a small change to the way the client works, buffering reads and writes independently (can be observed later), and I get similar numbers (dropped my local runs from ~12 to .038). This is that version: http://play.golang.org/p/8fR6-y6EBy

Now, I don't know Scala, but based on the constraints of the program, these actually all do the same thing. They time how long it takes to write 4 bytes * N and read 4 bytes * N (my version adds error checking). The Go version is reporting a bit more latency going in and out of the stack for individual syscalls.

I suspect the Scala version isn't even making those, as it likely doesn't need to observe the answers.

You just get more options in a lower level language.

est, almost 12 years ago

> The experiments where performed on a 2.7Ghz quad core MacBook Pro with both client and server running locally, so as to better measure pure processing overhead. The client would make 100 concurrent connections and send a total of 1 million pings to the server, evenly distributed over the connections. We measured the average round trip time.

Another let's rape localhost:8080 on a MacBook Pro™ benchmark

bad_user, almost 12 years ago

So even if Scala is ahead in this (flawed) benchmark, that's not how you write a TCP server in Scala, because you want to do it non-blocking. Not doing it based on asynchronous I/O means that in a real-world scenario the server will choke under the weight of slow connections, not to mention be susceptible to really cheap DoS attacks like Slowloris [1].

Seriously, it goes beyond the underlying I/O API that you're using. If anywhere in the code you're reading from an InputStream or you're writing to an OutputStream that's connected to an open socket, then that's a blocking call that can crush your server. Right now, every Java Servlets container that's not compatible with the latest Servlets API 3.1 can be brought down with Slowloris, even if under the hood they are using NIO.

Option A for writing a server in Scala is Netty [2].

Option B for writing a server in Scala is the new I/O layer in Akka [3].

[1] http://ha.ckers.org/slowloris/

[2] http://netty.io/

[3] http://doc.akka.io/docs/akka/snapshot/scala/io.html

voidlogic, almost 12 years ago

> The experiments where performed on a 2.7Ghz quad core MacBook Pro with both client and server running locally

No no no. Assuming your production code runs on Linux, THAT is where you need to do this test. It is extremely naive to assume that either the JVM or the Go runtime will perform system interfacing tasks even remotely similarly between OSX and Linux. Linux is what you will use in production. Linux is almost always faster (more effort from devs both on the kernel TCP/IP side and the runtime/userspace side).

Write your Scala, Java, Go wherever you want, but please, benchmark it in a clone of your production environment!

P.S. In production I assume your client and server will not be local... don't do this, kernels do awesome/dirty optimizations over loop-back interfaces, sometimes even bypassing large parts of the TCP/IP stack, parts you want included in any meaningful benchmark.

jaekwon, almost 12 years ago

Results from my Mac:

    Go server vs Go client:       10ms
    Scala server vs Scala client: 3ms
    Go server vs Scala client:    4ms
    Scala server vs Go client:    ????

Scala server against the Go client is really slow (?). I reduced the ping count by a factor of 100, and extrapolating I think it would have reported around 670ms. What gives?

I don't know much about Scala Futures, but isn't Scala's client doing something completely different than the Go client? Scala's client with Future.sequence looks like it's calling each `ping` method sequentially.

Printing the connection identifier {0,100} on open & close shows that while it isn't completely sequential, only about a handful of connections are open at a time.

On the other hand, the Go client appears to switch amongst goroutines more frequently. All the connections open before any connection closes.

In other words, I think the difference in performance is due to the difference in how randomly the connections are shuffled. The terrible performance in the last case, I think, shows a bottleneck in the Scala server rather than the Go client.

damian2000, almost 12 years ago

Isn't this more like a comparison of the JVM's TCP library vs Go's TCP library... not so much Scala vs Go?

aaron42net, almost 12 years ago

If the eventual production app would run on Linux (which I'm only guessing based on the context), this benchmark should probably be run there. Darwin's surprisingly higher system call and context switch overhead can be deceptive for apps that are OS-bound.

smegel, almost 12 years ago

Quick look at the language benchmark games, and Go is not 10x slower than Java for most tests. Java is often 3-5x slower than C/C++/etc (although maybe it didn't get enough time to warm up).

geal, almost 12 years ago

Ok, so. When you need to write a load balancer and want to test different languages for the task, you don't do a benchmark like that one.

Writing a "ping-pong"? And not using the same client to test both servers?

It would not have been too hard to write a simple proxy in both languages. Not even worrying about parsing HTTP headers, just testing TCP, that is really easy.

Now, if you really want to test the performance, you have to implement it differently. Just two small features you would need in both:

* non blocking IO: right now, you're starting a new future in Scala, that's easy to write but not really efficient (it might work better with goroutines)
* zero copy: if you're load balancing, you will spend your time moving bits, so you'd better make sure that you don't copy them too much. It is possible with Scala, but it looks like Go does not support it

Now, when you have reasonable testing grounds (that wouldn't be more than a hundred lines in both languages), better get your statistics right.

"The client would make 100 concurrent connections and send a total of 1 million pings to the server, evenly distributed over the connections. We measured the average round trip time" -> that is NOT how you should test. Here, you would want to know what is the nominal pressure the balancer could handle, so you must measure a lot of metrics:

* RTT
* bandwidth (per user and total)
* time to warmup (the JVM optimizes a lot of things on the fly, you have to wait for it)
* operational limits (what is the maximal bandwidth for which the performance crashes? same for number of users)

And then, you don't measure only the average values. You must measure the standard deviation. Because you could have a good average, but wildly varying data, and that is not good.

Last thing: the MacBook may not be a good testing system.

Oculus, almost 12 years ago

What would be the advantages of using a custom built load balancer vs. something off the shelf like Nginx?

willvarfar, almost 12 years ago

The benefit of Go is the cheap threads and CSP, which make it scale well for complex servers.

I think we'll see the Go runtime gaining Single System Image distribution, performance improvements and libraries not services (e.g. groupcache), making it a very different world to develop in than Scala.

dschiptsov, almost 12 years ago

> ... that the Go server had a memory footprint of only about 10 MB vs. Scala’s nearly 200 MB.

Priceless! 200 MB for such a crappy server with only a hundred connections.

So, it is not about whose wrapper around epoll is thinner, but about how the data are represented and copied.

shin_lao, almost 12 years ago

I think the JVM TCP stack is extremely mature, but I'm really surprised to see that Go is ten times slower.

It would have been interesting to have a C/C++ benchmark for reference.

madisp, almost 12 years ago

"The actual test code contained some functionality to deal with connection errors, omitted here for brevity."

Any way to see the actual test code?

msie, almost 12 years ago

Ugh, after reading all the comments here I wonder how a mere-mortal programmer gets multi-threaded network programming done right. It's not clear to me if there is a clear winner between Go and Scala/JVM. Are the majority of programs out there crappy, memory-hogging and non-performant?

EDIT: Any good references out there? Thanks!

luikore, almost 12 years ago

So this Go server is slower than a single threaded, blocking TCP server in Ruby. And the memory? Almost the same:

    require "socket"

    s = TCPServer.new 'localhost', 1201
    loop do
      c = s.accept          # handle one connection at a time
      if c.read(4)          # wait for the 4-byte ping
        c << 'Pong'
      end
      c.close
    end

anuraj, almost 12 years ago

It is the JVM rather than Scala - one of the fastest VMs implemented to date.

corresation, almost 12 years ago

As a word of advice -- you are almost certainly wasting your time writing a load balancer. There are close to zero cases where someone can legitimately justify such an exercise.

In any case, your benchmark is flawed (as virtually all benchmarks are). The reason the Go client is slower is that it tries to pervasively "thread" via the M:N scheduler -- every wait check causes it to yield the actual thread and switch goroutine, creating a large amount of overhead. The Scala case, on the other hand, is dramatically more limited and will not incur this overhead.

The Go server does not have this fault, and is likely at top performance. And aren't we talking about a server anyways?

Now as to the client, while we could naively criticize M:N scheduling based upon this, try giving it a more realistic workload (unless you seriously plan on load balancing pongs): instead of ping/pong, return larger lengths of data, e.g. 32KB, preferably over actual network connections (not localhost).

The Go client will catch up if not shoot into the lead. M:N scheduling is optimal for most real-world workloads, though it is less optimal for spin-off-a-million-goroutines-that-do-nothing type tests.

This is not a test of TCP overhead, or a realistic test, but instead demonstrates the small overhead of goroutines when you give each a minuscule amount of wait work.

cmccabe, almost 12 years ago

EDIT. OK, so. I ran this benchmark myself on an 8-core Xeon running Linux. 2.13 GHz, CentOS 6.2. Kernel was 2.6.32-220.el6.x86_64. 50 gigs of RAM.

I got somewhere between 2.0 and 2.2 "milliseconds per ping" for Scala 2.9, and somewhere between 3.5 and 3.7 for Go 1.1. This is not the 10x difference that the authors reported, but it is something. The difference may be due in part to the different platform and hardware I am using.

Contrary to what I wrote earlier, I noticed that GOMAXPROCS=8 did seem to be slower than GOMAXPROCS=4 here. I got around 4 "milliseconds per ping" with GOMAXPROCS=8. Using a mutex and explicit condition variable shaved off maybe 0.2 milliseconds on average (very rough estimation).

Again contrary to what I wrote earlier, Nagle on versus off didn't seem to matter in the Go code. I still think you should always have it off for a test like this, but on my setup I did not see a difference.

I still don't think this benchmark is showing what they think it is. I have a hunch that this is more of a scheduler benchmark than a TCP benchmark at all. I think I'd have to haul out vtune to get any further, and I'm getting kind of tired (after midnight here).
