40 Milliseconds of latency that just would not go away

386 points by r4um over 4 years ago

31 comments

deafcalculus over 4 years ago
Were delayed ACKs the problem? Disabling delayed ACKs seems like a better bet than using TCP_NODELAY, which turns off Nagle's algorithm.
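For concreteness, here is a minimal sketch (assuming Linux; not from the article) of the two knobs being compared. TCP_NODELAY is set once on the sender to turn off Nagle; TCP_QUICKACK disables delayed ACKs on the receiver, but the kernel resets it, so it is typically re-armed after each recv().

    /* Sketch only: the two socket options discussed above, assuming Linux. */
    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <sys/socket.h>

    static void disable_nagle(int fd)          /* sender side */
    {
        int one = 1;
        setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
    }

    static void disable_delayed_ack(int fd)    /* receiver side; re-arm after each recv() */
    {
        int one = 1;
        setsockopt(fd, IPPROTO_TCP, TCP_QUICKACK, &one, sizeof(one));
    }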
punnerud over 4 years ago
From John Nagle:

"(..) Unfortunately, delayed ACKs went in after I got out of networking in 1986, and this was never fixed. Now it's too late."

https://stackoverflow.com/a/16663206/2326672
sdoering over 4 years ago
> Down that path also lies madness.

Ohhhhh so true. I sadly have no such story to tell regarding performance optimization, but figuring out the intricacies of any complex system (for me at least) inevitably leads to knowing arcane stuff that might come in handy some time. On the other hand, it also - in my humble experience - leads to knowing a lot of arcane stuff that seems like it might affect a problem but is completely unrelated to the specific case one is dealing with.

Knowing when to discard arcane knowledge and when to jump onto that train of thought is, imho, crucial.

Then again, debugging arcane stuff in complex systems is just so much fun. One learns so much.
jlokier over 4 years ago
This is one *ancient* problem. I remember dealing with it in 2003.

Writeup from 1997 here (P-HTTP basically means HTTP version 1.1):

https://www.isi.edu/~johnh/PAPERS/Heidemann97a.html

> John Heidemann. Performance Interactions Between P-HTTP and TCP Implementations. ACM Computer Communication Review. 27, 2 (Apr. 1997), 65–73.

> This document describes several performance problems resulting from interactions between implementations of persistent-HTTP (P-HTTP) and TCP. Two of these problems tie P-HTTP performance to TCP delayed-acknowledgments, thus adding up to 200ms to each P-HTTP transaction. A third results in multiple slow-starts per TCP connection. Unresolved, these problems result in P-HTTP transactions which are 14 times slower than standard HTTP and 20 times slower than potential P-HTTP over a 10 Mb/s Ethernet. We describe each problem and potential solutions. After implementing our solutions to two of the problems, we observe that P-HTTP performs better than HTTP on a local Ethernet. Although we observed these problems in specific implementations of HTTP and TCP (Apache-1.1b4 and SunOS 4.1.3, respectively), we believe that these problems occur more widely.

Solutions for efficient batching of HTTP headers + data without delays involve TCP_NODELAY, and MSG_MORE / SPLICE_F_MORE / TCP_CORK / TCP_NOPUSH. Possibly TCP_QUICKACK may come in handy. Same for any protocol really, but HTTP is the one where there tends to be a separate sendmsg() and sendfile() on Linux.
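A minimal sketch of the header-plus-file pattern jlokier mentions (assuming Linux; the function and its parameters are illustrative, not from the paper): MSG_MORE tells the kernel the headers are only part of the message, so they are not pushed out as a lone small segment before sendfile() supplies the body.

    /* Sketch: send headers + file body without a runt segment in between.
     * Assumes Linux; error handling kept minimal. */
    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <sys/sendfile.h>
    #include <sys/socket.h>

    static int send_response(int sock, int file_fd, const char *hdr,
                             size_t hdr_len, off_t file_len)
    {
        int one = 1;
        setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));

        /* MSG_MORE: more data follows, so the kernel may coalesce the
         * headers with the body instead of sending them immediately. */
        if (send(sock, hdr, hdr_len, MSG_MORE) < 0)
            return -1;

        off_t off = 0;
        while (off < file_len) {
            ssize_t n = sendfile(sock, file_fd, &off, (size_t)(file_len - off));
            if (n <= 0)
                return -1;
        }
        return 0;
    }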
rsclient over 4 years ago
This is exactly why the Socket API in WinRT has Nagle off by default. The old way of dealing with sockets was to treat them like buffered files, or to drive them from a keyboard (so that Nagle is useful). But newer socket programs seem to just make a full chunk of information and send it at once. Those newer programs either turn off Nagle, or would be improved if they did.

So we bit the bullet and decided to make Nagle off by default.
dekhn over 4 years ago
I once had to debug the scaling performance of an MPI-based simulation algorithm on cheap Linux machines with TCP. I finally collected a TCP trace and showed it to the local expert, who said: "hmm, 250ms delay right there... that's the TCP retransmit timer... you're flooding the Ethernet switch with too many packets and the switch is dropping them. Enable <such and such a feature>."

Since then I've always kept various constants like these in human RAM, because it helps with root-causing.
scott_s over 4 years ago
John Nagle is a commenter here on HN, and has commented on this very thing: https://news.ycombinator.com/item?id=10608356

I have also run into this, but for me it was a periodic latency spike with steady but periodic messages. That latency spike went away when the messages were sent as fast as possible.
JoeAltmaier over 4 years ago
Similar to Nagle, there are reasons to combine packets on a session. Network equipment that fiddles with every packet can get backed up if the traffic packet count exceeds a limit. By Nagling (or doing something similar in your transmit code) you can increase your message rate through such bottlenecks.

We used to have a server cluster that used some 'hologram' style router on the receiving end to spread load. It had a hard limit on the number of packets per second it could handle. I changed our app to combine sends (2ms timer, not 40ms!) and halved our total traffic packet count. That put off the day they had to buy more server-side hardware to handle the load.

Btw, if the clients are on wifi networks, there's no point in aggregating sends past a pretty small size (512 bytes?), because wifi fragments (used to fragment?) packets to that smaller size over the air and never reassembles them, leaving that to the target server.
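A minimal sketch of the kind of application-level coalescing described above - not JoeAltmaier's actual code; the 2ms/1400-byte thresholds and function names are invented for illustration - assuming a single-threaded sender that calls pump() regularly from its event loop:

    /* Sketch: coalesce small sends, flushing once roughly an MTU is buffered
     * or 2ms after the first unsent byte was queued. Error handling omitted;
     * assumes each message fits in the buffer. */
    #include <string.h>
    #include <time.h>
    #include <unistd.h>

    #define COALESCE_USEC  2000L      /* 2ms timer, not 40ms! */
    #define COALESCE_BYTES 1400

    static char   buf[65536];
    static size_t buf_len;
    static struct timespec first_queued;

    static long usec_since(const struct timespec *t)
    {
        struct timespec now;
        clock_gettime(CLOCK_MONOTONIC, &now);
        return (now.tv_sec - t->tv_sec) * 1000000L + (now.tv_nsec - t->tv_nsec) / 1000L;
    }

    static void flush(int fd)
    {
        if (buf_len) { write(fd, buf, buf_len); buf_len = 0; }
    }

    void queue_send(int fd, const void *msg, size_t len)
    {
        if (buf_len + len > sizeof(buf)) flush(fd);
        if (buf_len == 0) clock_gettime(CLOCK_MONOTONIC, &first_queued);
        memcpy(buf + buf_len, msg, len);
        buf_len += len;
        if (buf_len >= COALESCE_BYTES) flush(fd);
    }

    void pump(int fd)   /* call periodically from the event loop */
    {
        if (buf_len && usec_since(&first_queued) >= COALESCE_USEC) flush(fd);
    }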
unilynx over 4 years ago
> Stuff like this just proves that a large part of this job is just remembering a bunch of weird data points and knowing when to match this story to that problem.

I've hit Nagle far in the past, and reading the title I thought 'well, that can't be about Nagle, because that was a 200ms delay'.

Looks like someone tuned it down to 40ms but didn't dare remove it. It would be interesting to know how they came to that choice.
euph0ria over 4 years ago
Why not just use tcpdump or Wireshark when troubleshooting network latencies? It usually only takes a minute or two to pinpoint the issue. Then you need to spend time understanding why the pinpointed behavior is what it is; sometimes it's in the application, sometimes not. I've solved so many issues over the years with tcpdump that it has become one of the most valuable tools I know.
rgossiaux over 4 years ago
As soon as I saw the title I remembered this Julia Evans post about the same issue: https://jvns.ca/blog/2015/11/21/why-you-should-understand-a-little-about-tcp/
errantspark over 4 years ago
I remember first learning of Nagle's algorithm back in the early WoW days, in my endless quest to get lower latency for PvP on my neighbor's cracked WEP. I don't really know if it matters much in 2020, but I still habitually run the *.reg file to disable it on every new Windows install.
nh2 over 4 years ago
For those wondering "so, how do I do it right?":

I was in that situation 4 years ago and did a short write-up on it:

https://gist.github.com/nh2/9def4bcf32891a336485

It explains how to avoid the 40ms delay and still batch data where possible for maximum efficiency. The key part is that you can toggle the TCP options during the lifetime of the connection to force flushes.

Review appreciated.
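For readers who do not want to click through, the core of the cork/uncork toggling idea is small; here is a rough sketch (assuming Linux; not a copy of the gist):

    /* Sketch: batch writes under TCP_CORK, then clear the cork to force
     * whatever is still buffered onto the wire (Linux-specific). */
    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <sys/socket.h>

    static void set_cork(int fd, int on)
    {
        setsockopt(fd, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
    }

    /* usage: set_cork(fd, 1); write(fd, ...); write(fd, ...); set_cork(fd, 0);  <- flush */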
lmilcin over 4 years ago
To be fair, this can be fixed with well designed libraries that don't rely on TCP doing the job for them of merging buffers and preventing small writes.

The issue is that the vast majority of libraries treat the problem as if it did not exist; they prefer not to get their hands dirty and just conveniently write a stream of data to the socket, leaving it to the user to correctly configure options on the socket.

But yes, in general, performance is at least in significant part about remembering a huge amount of trivia.
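One way a library can do that merging itself is to hand the kernel the whole message in a single call; a minimal sketch (POSIX writev(); the function name is illustrative) that sends a header and payload together instead of as two small writes:

    /* Sketch: one syscall for header + payload, so no lone runt segment is
     * ever queued behind Nagle/delayed ACK. Short-write handling omitted. */
    #include <sys/types.h>
    #include <sys/uio.h>

    static ssize_t send_message(int fd, const void *hdr, size_t hdr_len,
                                const void *payload, size_t payload_len)
    {
        struct iovec iov[2] = {
            { .iov_base = (void *)hdr,     .iov_len = hdr_len     },
            { .iov_base = (void *)payload, .iov_len = payload_len },
        };
        return writev(fd, iov, 2);
    }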
lawrjone over 4 years ago
It’s one of those trapdoors that people continually fall down: https://gocardless.com/blog/in-search-of-performance-how-we-shaved-200ms-off-every-post-request/
taneq over 4 years ago
World of Warcraft had Nagle's algorithm enabled for YEARS. That's one reason that VPN services were so popular and could cut 50-100ms off your ping time, especially if you were playing from Oceania.
jdblair over 4 years ago
This isn't explicitly related, but interesting, so I offer it up here.

When I read 40ms, it triggered a memory from tracking down a different 40ms latency bug a few years ago. I work on the Netflix app for set top boxes, and a particular pay TV company had a box based on AOSP L. Testing discovered that after enough times switching the app between foreground and background, playback would start to stutter. The vendor doing the integration blamed Netflix - they showed that in the stutter case, the Netflix app was not feeding video data quickly enough for playback. They stopped their analysis at this point, since as far as they were concerned, they had found the issue and we had to fix the Netflix app.

I doubted the app was the issue, as it ran on millions of other devices without showing this behavior. I instrumented the code and measured 40ms of extra delay from the thread scheduler. The 40ms was there, and was outside of our app's context. Literally, I measured it between the return of the thread handler and the next time the handler was called. So I responded, to paraphrase, it's not us, it's you. Your Android scheduler is broken.

But the onus was on me to prove it by finding the bug. I read the Android code, and learned Android threads are a userspace construct - the Android scheduler uses epoll() as a timer and calls your thread handler based on priority level. I thought, epoll() performance isn't guaranteed, maybe something obscure changed, and this change is adding an additional 40ms in this particular case. So I dove into the kernel, thinking the issue must be somewhere inside epoll().

Lucky for me, another engineer, working for a different vendor on the project, found the smoking gun in this patch in Android M (the next version). It was right there, an extra 40ms explicitly (and mistakenly) added when a thread is created while the app is in the background.

https://android.googlesource.com/platform/system/core/+/4cdce42%5E%21/#F0

    Fix janky navbar ripples -- incorrect timerslack values

    If a thread is created while the parent thread is "Background",
    then the default timerslack value gets set to the current timerslack
    value of the parent (40ms). The default value is used when
    transitioning to "Foreground" -- so the effect is that the timerslack
    value becomes 40ms regardless of foreground/background. This does
    occur intermittently for systemui when creating its render thread
    (pretty often on hammerhead and has been seen on shamu). If this
    occurs, then some systemui animations like navbar ripples can wait
    for up to 40ms to draw a frame when they intended to wait 3ms --
    jank. This fix is to explicitly set the foreground timerslack to
    50us. A consequence of setting timerslack behind the process' back
    is that any custom values for timerslack get lost whenever the
    thread has transition between fg/bg.

    --- a/libcutils/sched_policy.c
    +++ b/libcutils/sched_policy.c
    @@ -50,6 +50,7 @@
     // timer slack value in nS enforced when the thread moves to background
     #define TIMER_SLACK_BG 40000000
    +#define TIMER_SLACK_FG 50000

     static pthread_once_t the_once = PTHREAD_ONCE_INIT;

    @@ -356,7 +357,8 @@
                 &param);
         }

    -    prctl(PR_SET_TIMERSLACK_PID, policy == SP_BACKGROUND ? TIMER_SLACK_BG : 0, tid);
    +    prctl(PR_SET_TIMERSLACK_PID,
    +          policy == SP_BACKGROUND ? TIMER_SLACK_BG : TIMER_SLACK_FG, tid);
         return 0;
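If you want to check whether timer slack is biting one of your own threads, a tiny sketch (Linux-only; prctl() here acts on the calling thread) of reading and tightening the value:

    /* Sketch: read and tighten the calling thread's timer slack (nanoseconds).
     * 40000000 ns is the 40ms background value from the patch above. */
    #include <stdio.h>
    #include <sys/prctl.h>

    int main(void)
    {
        long slack = prctl(PR_GET_TIMERSLACK, 0, 0, 0, 0);
        printf("current timer slack: %ld ns\n", slack);

        /* tighten to 50us, the value the Android fix uses for foreground */
        prctl(PR_SET_TIMERSLACK, 50000, 0, 0, 0);
        return 0;
    }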
nialv7 over 4 years ago
My first reaction to the 40ms number was "TCP_NODELAY?".

That number is probably carved into my brain now.
Ono-Sendai over 4 years ago
Related: an old blog post of mine: http://forwardscattering.org/post/3 - 'Sockets should have a flushHint() API call.'
pronoiac over 4 years ago
I wish the fixes from the first fork were offered back to the main project.
nly over 4 years ago
If TCP had a header flag to indicate that the next segment was <MSS in size, then the receiver could be a lot smarter about whether it delayed the ACK.
jorangreef over 4 years ago
I was dealing with "Nagle's delay" only yesterday, adding setsockopt(TCP_NODELAY) for the alpha version of a new high-performance payments database called TigerBeetle: https://github.com/coilhq/tiger-beetle
2rsf over 4 years ago
I ran into similar problems when doing performance testing to determine our new wireless router's limits, only to find out they were far too low.

> you can’t fix TCP problems without understanding TCP

True, but many problems are not TCP problems, and it is not always easy to determine where your delay is coming from.
jgalt212 over 4 years ago
> Stuff like this just proves that a large part of this job is just remembering a bunch of weird data points and knowing when to match this story to that problem.

Sounds like interns doing rounds with the chief resident.
kevsim over 4 years ago
Second Nagle-related post I've seen this week! https://www.kdab.com/there-and-back-again/
brundolf over 4 years ago
> Stuff like this just proves that a large part of this job is just remembering a bunch of weird data points and knowing when to match this story to that problem.
nickdothutton over 4 years ago
Everyone should spend a day trying to optimise iSCSI traffic.
acvny over 4 years ago
Not clear what this is about. What delay?
nt2h9uh238h over 4 years ago
I have the same issue with my in-memory Node.js app on Amazon AWS. Nice story, but how do I FIX it?
bborud over 4 years ago
W. Richard Stevens wrote some good books on TCP/IP. May I suggest reading them?
bullen over 4 years ago
Can you disable Nagle's (enable nodelay) on a .js XMLHttpRequest or the newer fetch stuff? Maybe Chrome/Firefox disables it by default?

Edit: This community is stupid/toxic.