Different anecdote, but similar vibe....<p>In ~2010, I was benchmarking Solarflare (now Xilinx/AMD) cards and their OpenOnload kernel-bypass network stack. The results showed that two well-tuned systems could communicate with lower latency over the network than two CPU sockets within the same server could when they had to wait for the kernel's standard network stack to get involved. It was really illuminating and I started re-architecting based on that result.<p>Backing out some of that history... in ~2008, we started using FPGAs to handle specific network loads (US equity market data). It was exotic and a lot of work, but it significantly benefited that use case, both because of DMA to user-land and its filtering capabilities.<p>At that time our network was all 1 Gigabit. Soon thereafter, exchanges started offering 10G handoffs, so we upgraded our entire infrastructure to 10G cut-through switches (Arista) and 10G NICs (Myricom). This performed much better than the 1G FPGA and dramatically improved our entire infrastructure.<p>We then ported our market data feed handlers to Myricom's user-space network stack, because loads were continually increasing and the trading world was getting ever more competitive... and again we had a narrow solution (this time in software) to a challenging problem.<p>Then about a year later, Solarflare and its kernel-compatible OpenOnload arrived, and we could apply the power of kernel bypass to our entire infrastructure.<p>After that, the industry returned to FPGAs with 10G PHYs and tons of space to put whole strategies on-chip... although I was never involved with that next generation of trading tech.<p>I personally stayed with OpenOnload for all sorts of workloads, growing to use it with containerization and web stacks (Redis, Nginx). Nowadays you can use OpenOnload with XDP; again a narrow technology grows to fit broad applicability.
Since apparently no one is willing to read this excellent article, which even comes with fun sliders and charts...<p>> It turns out that Chrome actively throttles requests, including those to cached resources, to reduce I/O contention. This generally improves performance, but will mean that pages with a large number of cached resources will see a slower retrieval time for each resource.
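<p>To make the quoted behaviour concrete, here is a minimal TypeScript sketch of the general idea (not Chrome's actual scheduler): a fixed-size pool of in-flight work where cache reads sit in the same queue as network fetches, so a "fast" cache hit can still wait behind other requests. All names here are illustrative.<pre><code>
// Minimal sketch: a shared concurrency limit that applies to cache reads
// and network fetches alike. Purely illustrative; not Chrome's internals.
class RequestQueue {
  private inFlight = 0;
  private waiting: Array<() => void> = [];

  constructor(private maxInFlight: number) {}

  private acquire(): Promise<void> {
    if (this.inFlight < this.maxInFlight) {
      this.inFlight++;
      return Promise.resolve();
    }
    return new Promise((resolve) => this.waiting.push(resolve));
  }

  private release(): void {
    const next = this.waiting.shift();
    if (next) {
      next(); // hand the slot straight to the next queued task
    } else {
      this.inFlight--;
    }
  }

  async run<T>(task: () => Promise<T>): Promise<T> {
    await this.acquire();
    try {
      return await task();
    } finally {
      this.release();
    }
  }
}

// Stand-ins for real work; they just simulate latency.
const readFromDiskCache = (url: string) =>
  new Promise<string>((resolve) => setTimeout(() => resolve(`cache:${url}`), 5));
const fetchFromNetwork = (url: string) =>
  new Promise<string>((resolve) => setTimeout(() => resolve(`net:${url}`), 50));

// Both kinds of work share the same six slots, so a cached resource can be
// delayed behind network requests that were queued ahead of it.
const queue = new RequestQueue(6);
queue.run(() => readFromDiskCache("/app.js")).then(console.log);
queue.run(() => fetchFromNetwork("/api/data")).then(console.log);
</code></pre>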
Worth noting that around the end of 2020, Chrome and Firefox enabled cache partitioning by eTLD+1 to prevent scripts from gaining info from a shared cache. This kills the expected high hit rate from CDN-hosted libraries. <a href="https://developer.chrome.com/blog/http-cache-partitioning/" rel="nofollow">https://developer.chrome.com/blog/http-cache-partitioning/</a> <a href="https://developer.mozilla.org/en-US/docs/Web/Privacy/State_Partitioning" rel="nofollow">https://developer.mozilla.org/en-US/docs/Web/Privacy/State_P...</a>
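<p>A rough way to picture the change: the cache key went from being essentially the resource URL to a tuple that also includes the top-level site, so the same CDN URL embedded by two different sites lands in two different cache entries. A hypothetical sketch (the real key structure has more fields):<pre><code>
// Hypothetical sketch of cache key partitioning; real browser keys include
// more fields, but the effect is the same: identical CDN URLs loaded from
// different top-level sites no longer share a cache entry.
type CacheKey = string;

// Before partitioning: keyed by resource URL only.
function legacyKey(resourceUrl: string): CacheKey {
  return resourceUrl;
}

// After partitioning: keyed by (top-frame eTLD+1, resource URL).
function partitionedKey(topFrameSite: string, resourceUrl: string): CacheKey {
  return `${topFrameSite} ${resourceUrl}`;
}

const lib = "https://cdn.example/jquery-3.6.0.min.js";
console.log(legacyKey(lib));                    // one shared entry, whoever embeds it
console.log(partitionedKey("site-a.com", lib)); // entry for site-a
console.log(partitionedKey("site-b.com", lib)); // separate entry for site-b: no reuse
</code></pre>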
> This seemed odd to me, surely using a cached response would always be faster than making the whole request again! Well it turns out that in some cases, the network is faster than the cache.<p>Did I miss a follow-up on this, or did it remain unanswered what the benefit of racing against the network is?<p>The post basically says that sometimes the cache is slower because of throttling or bugs, but mostly bugs.<p>Why is Firefox sending an extra request instead of figuring out what is slowing down the cache? It seems like an overly expensive mitigation…
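<p>For what it's worth, Firefox's name for this is "race cache with network" (RCWN): it starts both the cache read and the network request and uses whichever answers first. A minimal sketch of the pattern, written against the service-worker Cache API purely as an illustration (RCWN itself lives inside Firefox's HTTP cache, not in a service worker):<pre><code>
// Minimal sketch of the "race cache with network" idea using the
// service-worker Cache API. Illustration only; not Firefox's implementation.
async function raceCacheWithNetwork(request: Request): Promise<Response> {
  const fromNetwork = fetch(request);

  // A cache miss (undefined) must not win the race, so the cache branch
  // only resolves on an actual hit; on a miss it simply never settles.
  const fromCache = new Promise<Response>((resolve) => {
    caches.match(request).then((hit) => {
      if (hit) resolve(hit);
    });
  });

  // Whichever source produces a usable response first is the one used.
  return Promise.race([fromNetwork, fromCache]);
}
</code></pre>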
Well, yeah. Disk cache can take hundreds of milliseconds to retrieve, even on modern SSDs. I had a handful of oddly heated discussions with an architect about this exact thing at my previous job. Showing him the network tab did not help, because he had read articles and was well informed about these things.
> Concatenating / bundling your assets is probably still a good practice, even on H/2 connections. Obviously this comes on balance with cache eviction costs and splitting pre- and post-load bundles.<p>I guess this latter part refers to the trade-off between compiling all your assets into a single file, and then requiring clients to re-download the entire bundle if you change a single CSS color. The other extreme is to not bundle anything (which, I gather from the article, is the standard practice since all major browsers support HTTP/2), but this leads to the described issue.<p>What about aggressively bundling, but also keeping track at compile time of diffs between historical bundles and the new bundle? Re-connecting clients could grab a manifest that names the newest mega-bundle as well as a map from historical versions to the patches needed to bring them up to date. A lot more work on the server side, but maybe it could be a good compromise?<p>Of course that's the easy version, but it has a huge flaw: all first-time clients have to download the entire mega-bundle before the browser can render anything, so to make it workable it would have to be compiled into a few bootstrap stages instead of a single mega-bundle.<p>I am <i>clearly</i> not a frontend dev. If you're going to throw tomatoes please also tell me why ;)<p>* edit: found the repo that made me think of this idea, <a href="https://github.com/msolo/msolo/blob/master/vfl/vfl.py" rel="nofollow">https://github.com/msolo/msolo/blob/master/vfl/vfl.py</a>; it's from antiquity and probably predates Node and Babel/webpack, but the idea is that you name individual asset versions with a SHA or tag and let them be cached forever; to update the web app you just change a reference in the root so clients download a different resource dependency tree, re-using the unchanged parts. A sketch of that content-addressed idea follows below. *
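<p>The content-addressed idea in the edit is roughly what modern bundlers do with hashed filenames. A hypothetical sketch of the build-time side (names and layout are made up):<pre><code>
// Hypothetical sketch of the content-addressed idea: each asset is named by
// a hash of its contents and cached forever; only a small root manifest
// changes between deploys.
import { createHash } from "node:crypto";

interface Manifest {
  // logical name -> immutable, hash-named URL
  assets: Record<string, string>;
}

function hashedName(logicalName: string, contents: string): string {
  const digest = createHash("sha256").update(contents).digest("hex").slice(0, 12);
  const dot = logicalName.lastIndexOf(".");
  return `${logicalName.slice(0, dot)}.${digest}${logicalName.slice(dot)}`;
}

function buildManifest(files: Record<string, string>): Manifest {
  const assets: Record<string, string> = {};
  for (const [name, contents] of Object.entries(files)) {
    // Hash-named files can be served with Cache-Control: immutable,
    // because any content change produces a brand-new URL.
    assets[name] = `/static/${hashedName(name, contents)}`;
  }
  return { assets };
}

// Only the (tiny, short-lived) manifest changes between releases; unchanged
// assets keep their URLs and stay cached on returning clients.
const manifest = buildManifest({
  "app.js": "console.log('hello');",
  "app.css": "body { color: rebeccapurple; }",
});
console.log(manifest.assets["app.js"]); // e.g. /static/app.<hash>.js
</code></pre>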
<a href="https://simonhearne.com/2020/network-faster-than-cache/#by-operating-system" rel="nofollow">https://simonhearne.com/2020/network-faster-than-cache/#by-o...</a><p>When I first saw this, I was confused why Linux (and Ubuntu and Debian) performed so poorly. But then I asked myself; is the data even representable? Or is Linux worse because users on Linux tend to stick to lower end hardware, since the measurenents were, AFAIR, taken for granted from website visitors instead of being tested on same hardware with different operating systems installed?<p>I don't know, can someone explain the discrepancy?<p>Is the raw data available? What is the number of requests that were included in the measurements for Debian (that doesn't hit cache in the first 5 ms)?<p>Thanks!
I wonder if browsers could design a heuristic to cache multiple items in one cache entry -- that is, instead of the website doing file concatenation, the browser dynamically decides to concatenate some resources in order to reduce the number of reads from the local cache. For example, the browser would know on the initial load which requests get made together (such as three js files or four images), so they could get batched into a single cache entry.
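<p>A hypothetical sketch of what that heuristic might look like: resources observed loading together on a previous visit get written back as one combined entry, so a later visit pays for one cache read instead of several. None of this reflects how any real browser cache works.<pre><code>
// Hypothetical sketch of the batching idea; not how real browsers work.
interface CachedResource {
  url: string;
  body: Uint8Array;
}

class BatchingCache {
  private entries = new Map<string, CachedResource[]>();

  // Store a group of resources that loaded together (e.g. during one
  // navigation) under a single entry keyed by the page URL.
  storeBatch(pageUrl: string, resources: CachedResource[]): void {
    this.entries.set(pageUrl, resources);
  }

  // One lookup returns every resource the page needed last time.
  loadBatch(pageUrl: string): CachedResource[] | undefined {
    return this.entries.get(pageUrl);
  }
}

// The three scripts observed on the first visit come back from a single
// cache read on the next one.
const cache = new BatchingCache();
const enc = new TextEncoder();
cache.storeBatch("https://example.com/", [
  { url: "/a.js", body: enc.encode("...") },
  { url: "/b.js", body: enc.encode("...") },
  { url: "/c.js", body: enc.encode("...") },
]);
console.log(cache.loadBatch("https://example.com/")?.length); // 3
</code></pre>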
So is the takeaway that data in the RAM of some server connected by fast network is sometimes "closer" in retrieval time than that same data on a local SSD?
Sun Microsystems stated:
"The network is the computer"
<a href="https://en.m.wikipedia.org/wiki/The_Network_is_the_Computer" rel="nofollow">https://en.m.wikipedia.org/wiki/The_Network_is_the_Computer</a><p>Sun was right.
Much confusion in the comments.<p>Tl;dr: the cache is "slow" because the browser throttles the number of in-flight requests -- including those to the cache. I.e. the cache itself isn't slow, but reads from it wait in a queue, and a network request might be ahead of them in that queue.
Never knew caches could take longer to retrieve; I always thought it'd be near instant.<p>No good solutions either: the throttling is browser behaviour, so there's no practical way to work around it when building web apps.
For me this is very noticeable whenever I open a new Chrome tab. It takes 3+ seconds for the icons of the recently visited sites to appear, whatever cache is used for the favicons is extremely slow. Thankfully the disk cache for other resources runs at a more normal speed.