Different anecdote, but similar vibe....<p>In ~2010, I was benchmarking Solarflare (now Xilinx/AMD) cards and their OpenOnload kernel-bypass network stack. The results showed that two well-tuned systems could communicate with lower latency over the network than two CPU sockets within the same server could when they had to wait for the kernel's standard network stack to get involved. It was really illuminating and I started re-architecting based on that result.<p>Backing out some of that history... in ~2008, we started using FPGAs to handle specific network loads (US equity market data). It was exotic and a lot of work, but it significantly benefited that use case, both because of DMA to user-land and its filtering capabilities.<p>At that time our network was all 1 Gigabit. Soon thereafter, exchanges started offering 10G handoffs, so we upgraded our entire infrastructure to 10G cut-through switches (Arista) and 10G NICs (Myricom). This performed much better than the 1G FPGA and dramatically improved our entire infrastructure.<p>We then ported our market data feed handlers to Myricom's user-space network stack, because loads were continually increasing and the trading world was getting ever more competitive... and again we had a narrow solution (this time in software) to a challenging problem.<p>Then about a year later, Solarflare and its kernel-compatible OpenOnload arrived, and we could apply the power of kernel bypass to our entire infrastructure.<p>After that, the industry returned to FPGAs with 10G PHYs and tons of space to put whole strategies on-chip... although I was never involved with that next generation of trading tech.<p>I personally stayed with OpenOnload for all sorts of workloads, growing to use it with containerization and web stacks (Redis, Nginx). Nowadays you can use OpenOnload with XDP; again a narrow technology grows to fit broad applicability.
Since apparently no one is willing to read this excellent article, which even comes with fun sliders and charts...<p>> It turns out that Chrome actively throttles requests, including those to cached resources, to reduce I/O contention. This generally improves performance, but will mean that pages with a large number of cached resources will see a slower retrieval time for each resource.
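<p>To make the quoted behaviour concrete, here is a minimal TypeScript sketch of the general idea (not Chrome's actual scheduler): a fixed-size pool of in-flight work where cache reads sit in the same queue as network fetches, so a "fast" cache hit can still wait behind other requests. All names here are illustrative.<pre><code>
// Minimal sketch: a shared concurrency limit that applies to cache reads
// and network fetches alike. Purely illustrative; not Chrome's internals.
class RequestQueue {
  private inFlight = 0;
  private waiting: Array<() => void> = [];

  constructor(private maxInFlight: number) {}

  private acquire(): Promise<void> {
    if (this.inFlight < this.maxInFlight) {
      this.inFlight++;
      return Promise.resolve();
    }
    return new Promise((resolve) => this.waiting.push(resolve));
  }

  private release(): void {
    const next = this.waiting.shift();
    if (next) {
      next(); // hand the slot straight to the next queued task
    } else {
      this.inFlight--;
    }
  }

  async run<T>(task: () => Promise<T>): Promise<T> {
    await this.acquire();
    try {
      return await task();
    } finally {
      this.release();
    }
  }
}

// Stand-ins for real work; they just simulate latency.
const readFromDiskCache = (url: string) =>
  new Promise<string>((resolve) => setTimeout(() => resolve(`cache:${url}`), 5));
const fetchFromNetwork = (url: string) =>
  new Promise<string>((resolve) => setTimeout(() => resolve(`net:${url}`), 50));

// Both kinds of work share the same six slots, so a cached resource can be
// delayed behind network requests that were queued ahead of it.
const queue = new RequestQueue(6);
queue.run(() => readFromDiskCache("/app.js")).then(console.log);
queue.run(() => fetchFromNetwork("/api/data")).then(console.log);
</code></pre>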
Worth noting that around the end of 2020, Chrome and Firefox enabled cache partitioning by eTLD+1 to prevent scripts from gaining info from a shared cache. This kills the expected high hit rate from CDN-hosted libraries. <a href="https://developer.chrome.com/blog/http-cache-partitioning/" rel="nofollow">https://developer.chrome.com/blog/http-cache-partitioning/</a> <a href="https://developer.mozilla.org/en-US/docs/Web/Privacy/State_Partitioning" rel="nofollow">https://developer.mozilla.org/en-US/docs/Web/Privacy/State_P...</a>
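<p>A rough way to picture the change: the cache key went from being essentially the resource URL to a tuple that also includes the top-level site, so the same CDN URL embedded by two different sites lands in two different cache entries. A hypothetical sketch (the real key structure has more fields):<pre><code>
// Hypothetical sketch of cache key partitioning; real browser keys include
// more fields, but the effect is the same: identical CDN URLs loaded from
// different top-level sites no longer share a cache entry.
type CacheKey = string;

// Before partitioning: keyed by resource URL only.
function legacyKey(resourceUrl: string): CacheKey {
  return resourceUrl;
}

// After partitioning: keyed by (top-frame eTLD+1, resource URL).
function partitionedKey(topFrameSite: string, resourceUrl: string): CacheKey {
  return `${topFrameSite} ${resourceUrl}`;
}

const lib = "https://cdn.example/jquery-3.6.0.min.js";
console.log(legacyKey(lib));                    // one shared entry, whoever embeds it
console.log(partitionedKey("site-a.com", lib)); // entry for site-a
console.log(partitionedKey("site-b.com", lib)); // separate entry for site-b: no reuse
</code></pre>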
> This seemed odd to me, surely using a cached response would always be faster than making the whole request again! Well it turns out that in some cases, the network is faster than the cache.<p>Did I miss a follow-up on this, or did it remain unanswered what the benefit of racing against the network is?<p>The post basically says that sometimes the cache is slower because of throttling or bugs, but mostly bugs.<p>Why is Firefox sending an extra request instead of figuring out what is slowing down the cache? It seems like an overly expensive mitigation…
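<p>For what it's worth, Firefox's name for this is "race cache with network" (RCWN): it starts both the cache read and the network request and uses whichever answers first. A minimal sketch of the pattern, written against the service-worker Cache API purely as an illustration (RCWN itself lives inside Firefox's HTTP cache, not in a service worker):<pre><code>
// Minimal sketch of the "race cache with network" idea using the
// service-worker Cache API. Illustration only; not Firefox's implementation.
async function raceCacheWithNetwork(request: Request): Promise<Response> {
  const fromNetwork = fetch(request);

  // A cache miss (undefined) must not win the race, so the cache branch
  // only resolves on an actual hit; on a miss it simply never settles.
  const fromCache = new Promise<Response>((resolve) => {
    caches.match(request).then((hit) => {
      if (hit) resolve(hit);
    });
  });

  // Whichever source produces a usable response first is the one used.
  return Promise.race([fromNetwork, fromCache]);
}
</code></pre>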
Well, yeah. Disk cache can take hundreds of milliseconds to retrieve, even on modern SSDs. I had a handful of oddly heated discussions with an architect about this exact thing at my previous job. Showing him the network tab did not help, because he had read articles and was well informed about these things.
> Concatenating / bundling your assets is probably still a good practice, even on H/2 connections. Obviously this comes on balance with cache eviction costs and splitting pre- and post-load bundles.<p>I guess this latter part refers to the trade-off between compiling all your assets into a single file, and then requiring clients to re-download the entire bundle if you change a single CSS color. The other extreme is to not bundle anything (which, I gather from the article, is the standard practice since all major browsers support HTTP/2), but this leads to the described issue.<p>What about aggressively bundling, but also keeping track at compile time of diffs between historical bundles and the new bundle? Re-connecting clients could grab a manifest that names the newest mega-bundle as well as a map from historical versions to the patches needed to bring them up to date. A lot more work on the server side, but maybe it could be a good compromise?<p>Of course that's the easy version, but it has a huge flaw: all first-time clients have to download the entire mega-bundle before the browser can render anything, so to make it workable it would have to be compiled into a few bootstrap stages instead of a single mega-bundle.<p>I am <i>clearly</i> not a frontend dev. If you're going to throw tomatoes please also tell me why ;)<p>* edit: found the repo that made me think of this idea, <a href="https://github.com/msolo/msolo/blob/master/vfl/vfl.py" rel="nofollow">https://github.com/msolo/msolo/blob/master/vfl/vfl.py</a>; it's from antiquity and probably predates Node and Babel/webpack, but the idea is that you name individual asset versions with a SHA or tag and let them be cached forever; to update the web app you just change a reference in the root so clients download a different resource dependency tree, re-using the unchanged parts. A sketch of that content-addressed idea follows below. *
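<p>The content-addressed idea in the edit is roughly what modern bundlers do with hashed filenames. A hypothetical sketch of the build-time side (names and layout are made up):<pre><code>
// Hypothetical sketch of the content-addressed idea: each asset is named by
// a hash of its contents and cached forever; only a small root manifest
// changes between deploys.
import { createHash } from "node:crypto";

interface Manifest {
  // logical name -> immutable, hash-named URL
  assets: Record<string, string>;
}

function hashedName(logicalName: string, contents: string): string {
  const digest = createHash("sha256").update(contents).digest("hex").slice(0, 12);
  const dot = logicalName.lastIndexOf(".");
  return `${logicalName.slice(0, dot)}.${digest}${logicalName.slice(dot)}`;
}

function buildManifest(files: Record<string, string>): Manifest {
  const assets: Record<string, string> = {};
  for (const [name, contents] of Object.entries(files)) {
    // Hash-named files can be served with Cache-Control: immutable,
    // because any content change produces a brand-new URL.
    assets[name] = `/static/${hashedName(name, contents)}`;
  }
  return { assets };
}

// Only the (tiny, short-lived) manifest changes between releases; unchanged
// assets keep their URLs and stay cached on returning clients.
const manifest = buildManifest({
  "app.js": "console.log('hello');",
  "app.css": "body { color: rebeccapurple; }",
});
console.log(manifest.assets["app.js"]); // e.g. /static/app.<hash>.js
</code></pre>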
<a href="https://simonhearne.com/2020/network-faster-than-cache/#by-operating-system" rel="nofollow">https://simonhearne.com/2020/network-faster-than-cache/#by-o...</a><p>When I first saw this, I was confused why Linux (and Ubuntu and Debian) performed so poorly. But then I asked myself; is the data even representable? Or is Linux worse because users on Linux tend to stick to lower end hardware, since the measurenents were, AFAIR, taken for granted from website visitors instead of being tested on same hardware with different operating systems installed?<p>I don't know, can someone explain the discrepancy?<p>Is the raw data available? What is the number of requests that were included in the measurements for Debian (that doesn't hit cache in the first 5 ms)?<p>Thanks!
I wonder if browsers could design a heuristic to cache multiple items in one cache entry -- that is, instead of the website doing file concatenation, the browser dynamically decides to concatenate some resources in order to reduce the number of reads from the local cache. For example, the browser would know on the initial load which requests get made together (such as three js files or four images), so they could get batched into a single cache entry.
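<p>A hypothetical sketch of what that heuristic might look like: resources observed loading together on a previous visit get written back as one combined entry, so a later visit pays for one cache read instead of several. None of this reflects how any real browser cache works.<pre><code>
// Hypothetical sketch of the batching idea; not how real browsers work.
interface CachedResource {
  url: string;
  body: Uint8Array;
}

class BatchingCache {
  private entries = new Map<string, CachedResource[]>();

  // Store a group of resources that loaded together (e.g. during one
  // navigation) under a single entry keyed by the page URL.
  storeBatch(pageUrl: string, resources: CachedResource[]): void {
    this.entries.set(pageUrl, resources);
  }

  // One lookup returns every resource the page needed last time.
  loadBatch(pageUrl: string): CachedResource[] | undefined {
    return this.entries.get(pageUrl);
  }
}

// The three scripts observed on the first visit come back from a single
// cache read on the next one.
const cache = new BatchingCache();
const enc = new TextEncoder();
cache.storeBatch("https://example.com/", [
  { url: "/a.js", body: enc.encode("...") },
  { url: "/b.js", body: enc.encode("...") },
  { url: "/c.js", body: enc.encode("...") },
]);
console.log(cache.loadBatch("https://example.com/")?.length); // 3
</code></pre>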
So is the takeaway that data in the RAM of some server connected by fast network is sometimes "closer" in retrieval time than that same data on a local SSD?
Sun Microsystems stated:
"The network is the computer"
<a href="https://en.m.wikipedia.org/wiki/The_Network_is_the_Computer" rel="nofollow">https://en.m.wikipedia.org/wiki/The_Network_is_the_Computer</a><p>Sun was right.
Much confusion in the comments.<p>Tl;dr: the cache is "slow" because the browser throttles the number of in-flight requests -- including those to the cache. I.e. the cache itself isn't slow, but reads from it wait in a queue, and a network request might be ahead of them in that queue.
Never knew caches could take longer to retrieve; I always thought it'd be near instant.<p>No good solutions either: the throttling is browser behaviour, so there's no practical way to work around it when building web apps.
For me this is very noticeable whenever I open a new Chrome tab. It takes 3+ seconds for the icons of the recently visited sites to appear, whatever cache is used for the favicons is extremely slow. Thankfully the disk cache for other resources runs at a more normal speed.