In real-life web serving situations, as opposed to benchmarks, the majority of the fds are not active. It's the slow guys that kill you.<p>A client on a fast connection will come in and pull the data as fast as the server can spit it out, keeping the process and the buffers occupied for the minimum amount of wall-clock time, and the 'poll' cycle only runs a handful of times.<p>But the slowpokes, the ones on dial-up and on congested lines, will get you every time. They keep the processes busy far longer than you'd want, and you have to hit the 'poll' cycle far more frequently: first to see if they've finally finished sending you a request, then to see if they've finally received the last little bit of data you sent them.<p>The impact of this is very easy to underestimate, and if you're benchmarking web servers for real-world conditions you could do a lot worse than to run a test across a line that is congested on purpose.
Zed isn't the only one who has found epoll to be slower than poll. The author of libev basically says the same thing. See <a href="http://pod.tst.eu/http://cvs.schmorp.de/libev/ev.pod" rel="nofollow">http://pod.tst.eu/http://cvs.schmorp.de/libev/ev.pod</a> and search for EVBACKEND_EPOLL.<p>I wonder how kqueue behaves compared to poll and epoll. Kqueue has a less stupid interface because it allows you to perform batch updates with a single syscall.
It is worth pointing out that the original epoll benchmarks were focused on how performance scaled with the number of dead connections, not performance in general:<p><a href="http://www.xmailserver.org/linux-patches/nio-improve.html" rel="nofollow">http://www.xmailserver.org/linux-patches/nio-improve.html</a><p>And as jacquesm points out, in a web-facing server, that's the case you should care about. A 15-20% performance hit in a situation a web-facing server is never going to see doesn't matter when you consider that the 'faster' method is 80% slower (or worse) in lots of real world scenarios.<p>I'll be interested to see how the superpoll approach ends up working, but my first impression is 'more complexity, not much more benefit'.
Pardon my ignorance, I haven't built high performance servers at this low a level, but I'm intrigued:<p>What exactly is the definition of an "active" file descriptor in this context?<p>My best guess after reading the man pages is that poll() takes an array of file descriptors to monitor and sets flags in the relevant array entries, which your code then needs to scan linearly for changes, whereas epoll_wait() gives you an array of events, thus avoiding checking file descriptors which haven't received any events. Active file descriptors would therefore be those that did indeed receive an event during the call.<p>EDIT: thanks for pointing out Zed's "superpoll" idea. I somehow completely missed that paragraph in the article, which makes the following paragraph redundant.<p>If this is correct, it sounds to me (naive as I am) as if some kind of hybrid approach would be the most efficient: stuff the idling/lagging connections into an epoll pool and add the <i>pool</i>'s file descriptor to the array of "live" connections you use with poll(). That of course assumes you can identify a set of fds which are indeed most active.
The blog post does not say if the epoll code uses level triggering or edge triggering. It would be interesting to see the results for both modes. The smaller number of system calls required for edge triggering might make a difference in performance.
Is it just me, or did Zed not describe his testing methodology in any detail?<p>I can't even find a reference to his OS configuration and version details that he's developing on, which seems to me like a critical detail.
Let's assume we have 20k open FDs.<p>In the case of poll(), you have to transfer this array of FDs from user space to kernel space each time you call poll(). Now compare this with epoll (let's assume we are using the EPOLLET trigger), where you only have to transfer each file descriptor once.<p>You might say the copying won't matter, but it will matter when you have a lot of events coming in on the 20k FDs, which eventually leads to calling poll() at a higher rate, hence more copying of data between user space and the kernel (8 bytes per struct pollfd * 20k, ~160 KB each call).
Zed, what's with all the premature optimization? Surely Mongrel2 should first be able to make coffee, build you an island, and f@!in transform into a jet and fly you there before you start making it faster!<p>Just kidding. It's always nice to see science in action. Great work! I suspect there's an impact on ZeroMQ's own poll/epoll strategy.
Question: as the ATR goes higher, so does the proportional time spent in poll or epoll, no?<p>So if you have a thousand fds and they're all active, you have to deal with a thousand fds, which would make the difference between poll and epoll insignificant (only <i>twice</i> as fast, not even an order of magnitude!)?<p>This would make the micro-benchmark quite micro! Annoyingly enough, I think that means that the real way to find out would be an httperf run against each backend. A lot more work...
Very nice write-up. Little details such as this should make Mongrel2 very solid. It's nice to see how he analyzed the issues around poll and epoll and then figured out how to make use of both for optimum performance no matter what happens in production. Many other programs could benefit from this sort of analysis, although at different levels... e.g. sorted vectors may be better for smaller containers but hash tables better for larger containers, etc.
Interesting article! Is 'super-poll' done yet? I would have liked to see a super-poll line on some of those graphs to see how it compares to just vanilla poll and epoll at different ATRs. Though I guess you would also have to test situations where the ATR varies over time (so that you could measure the impact of moving fds back and forth).
It's little wonder that these kinds of people think everyone else is just too stupid to realize such things. What they want is fame and followers. (btw, don't forget to donate!)<p>hint: nginx/src/event/modules/ngx_epoll_module.c<p>Maybe one should learn how to use epoll and, perhaps, how to program? ^_^