This article touches on "Request Coalescing", which is a super important concept - I've also seen this called "dog-pile prevention" in the past.<p>Varnish has this built in - good to see it's easy to configure with NGINX too.<p>One of my favourite caching proxy tricks is to run a cache with a very short timeout, but with dog-pile prevention baked in.<p>This can be amazing for protecting against sudden unexpected traffic spikes. Even a cache timeout of 5 seconds will provide robust protection against tens of thousands of hits per second, because request coalescing/dog-pile prevention will ensure that your CDN host sends a request to the origin at most once every five seconds.<p>I've used this on high traffic sites and seen it robustly absorb any amount of unauthenticated (hence no variety on a per-cookie basis) traffic.
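A minimal sketch of the short-timeout-plus-coalescing idea described above, in Python with `asyncio`. It is not how Varnish or NGINX implement it; the `CoalescingCache` class, the `slow_origin` stand-in for an origin server, and the 5-second default are all illustrative assumptions.

```python
import asyncio
import time
from typing import Awaitable, Callable, Dict, Tuple


class CoalescingCache:
    """Short-TTL cache that allows only one in-flight origin fetch per key."""

    def __init__(self, fetch_origin: Callable[[str], Awaitable[str]],
                 ttl_seconds: float = 5.0) -> None:
        self._fetch_origin = fetch_origin
        self._ttl = ttl_seconds
        self._entries: Dict[str, Tuple[float, str]] = {}  # key -> (expiry, body)
        self._in_flight: Dict[str, "asyncio.Task[str]"] = {}

    async def get(self, key: str) -> str:
        entry = self._entries.get(key)
        if entry is not None and entry[0] > time.monotonic():
            return entry[1]  # fresh hit: the origin is not contacted
        task = self._in_flight.get(key)
        if task is None:
            # First request to miss becomes the single fetch for this key.
            task = asyncio.create_task(self._refresh(key))
            self._in_flight[key] = task
        # Every other concurrent miss waits on that same task.
        return await task

    async def _refresh(self, key: str) -> str:
        try:
            body = await self._fetch_origin(key)
            self._entries[key] = (time.monotonic() + self._ttl, body)
            return body
        finally:
            self._in_flight.pop(key, None)


async def demo() -> Tuple[int, int]:
    origin_hits = 0

    async def slow_origin(path: str) -> str:
        # Hypothetical origin: counts calls and takes 50 ms to answer.
        nonlocal origin_hits
        origin_hits += 1
        await asyncio.sleep(0.05)
        return f"body for {path}"

    cache = CoalescingCache(slow_origin, ttl_seconds=5.0)
    # 1000 requests arrive at once for the same uncached path.
    bodies = await asyncio.gather(*(cache.get("/") for _ in range(1000)))
    return origin_hits, len(set(bodies))
```

Calling `asyncio.run(demo())` returns `(1, 1)`: a thousand simultaneous requests produce one origin fetch and one shared response body. Until the entry expires, further requests for that path are served from the cache.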
Love the level of detail that Fly's articles usually go into.<p>We have a distributed CDN-like feature in the hosted version of our open source search engine [1] - we call it our "Search Delivery Network". It works on the same principles, with the added nuance of also needing to replicate data over high-latency networks between data centers as far apart as Sao Paulo and Mumbai, for example. That brings with it another fun set of challenges to deal with! Hoping to write about it when bandwidth allows.<p>[1] <a href="https://cloud.typesense.org" rel="nofollow">https://cloud.typesense.org</a>
This is cool and informative and Kurt's writing is great:<p>The briny deeps are filled with undersea cables, crying out constantly to nearby ships: "drive through me"! Land isn't much better, as the old networkers shanty goes: "backhoe, backhoe, digging deep — make the backbone go to sleep".
>The term "CDN" ("content delivery network") conjures Google-scale companies managing huge racks of hardware, wrangling hundreds of gigabits per second. But CDNs are just web applications. That's not how we tend to think of them, but that's all they are. You can build a functional CDN on an 8-year-old laptop while you're sitting at a coffee shop.<p>huh yeah never thought about it<p>I blame how CDNs are advertised for the visual disconnect
Years ago I was involved with some high performance delivery of a bunch of newspapers, and we used Squid[1] to good effect. One nice thing you could do as well (but it's probably a bit hacky and old school these days) was to "open up" only parts of the web page to be dynamic while the rest was cached (or have different cache rules for different page components)[2]. With some legacy apps (like some CMSes) this can hugely improve performance while not sacrificing the dynamic and "fresh looking" parts of the website.<p>[1] <a href="http://www.squid-cache.org/" rel="nofollow">http://www.squid-cache.org/</a>
[2] <a href="https://en.wikipedia.org/wiki/Edge_Side_Includes" rel="nofollow">https://en.wikipedia.org/wiki/Edge_Side_Includes</a>
This is so great. See also <a href="https://fly.io/blog/ssh-and-user-mode-ip-wireguard/" rel="nofollow">https://fly.io/blog/ssh-and-user-mode-ip-wireguard/</a>
As someone who’s mostly clueless about BGP but has a fair grasp of all the other layers mentioned, I’d love to see posts like this going more in depth on it for folks like myself.
One thing the post misses is that Cloudflare uses a customised version of Nginx, and Fastly does the same with Varnish
(don't know about Netlify and ATS)<p>Out of the box nginx doesn't support HTTP/2 prioritisation, so building a CDN with nginx doesn’t mean you're going to be delivering as good a service as Cloudflare<p>Another major challenge with CDNs is peering and private backhaul: if you're not pushing major traffic then your customers aren't going to get the best peering with other carriers / ISPs…
> 3. Be like a game server: Ping a bunch of servers and use the best. Downside: gotta own the client. Upside: doesn't matter, because you don't own the client.<p>"If you can run code on it, you can own it". Your front page could just be a tiny loader js that fires off a fetch() for a zero byte resource to all your mirrors, and then proceeds to load the content from the first responder.
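A rough sketch of the "probe every mirror, load from the first responder" idea. It is written in Python with `asyncio` rather than as the browser-side `fetch()` loader the comment describes, so it shows only the racing logic. The mirror URLs, the `first_responder` helper and the simulated latencies are hypothetical; a real probe would issue a small HTTP request to each mirror.

```python
import asyncio
from typing import Awaitable, Callable, Iterable


async def first_responder(mirrors: Iterable[str],
                          probe: Callable[[str], Awaitable[None]]) -> str:
    """Probe all mirrors concurrently; return the first whose probe succeeds."""

    async def probe_one(mirror: str) -> str:
        await probe(mirror)
        return mirror

    pending = {asyncio.create_task(probe_one(m)) for m in mirrors}
    try:
        while pending:
            done, pending = await asyncio.wait(
                pending, return_when=asyncio.FIRST_COMPLETED)
            for task in done:
                # A failed probe is skipped; keep waiting for the others.
                if task.exception() is None:
                    return task.result()
        raise RuntimeError("no mirror responded")
    finally:
        # Stop probing the slower mirrors once a winner is known.
        for task in pending:
            task.cancel()


async def demo() -> str:
    # Hypothetical mirrors with simulated round-trip times in seconds.
    latencies = {
        "https://ams.example.com": 0.03,
        "https://sin.example.com": 0.09,
        "https://iad.example.com": 0.01,
    }

    async def fake_probe(mirror: str) -> None:
        await asyncio.sleep(latencies[mirror])

    return await first_responder(latencies, fake_probe)
```

With these simulated latencies, `asyncio.run(demo())` returns `"https://iad.example.com"`, the mirror with the shortest round trip. The client would then load the real content from that host.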
> DNS: Run trick DNS servers that return specific server addresses based on IP geolocation. Downside: the Internet is moving away from geolocatable DNS source addresses. Upside: you can deploy it anywhere without help.<p>Can anyone expand on how/why "the Internet is moving away from geolocatable DNS source addresses"?
It is strange that you put a time duration in front of CDN (content delivery network), because given all the recent incidents with Fastly, Akamai and Bunny, I read it as 5 hours Centralised Downtime Network.
Does Nginx still not support cache invalidation? If you set up a long TTL, is there a way to remove some files from the cache without nuking the entire cache and restarting an instance?
I like to blog from the raw origin and not use CDNs, because if a blog post is changed I have to manually purge the CDN cache, which can happen a lot. CDNs also have the caveat that if they're down, a page can load very slowly while the browser keeps trying to load the asset.
Fly is great and I love reading their blog posts.<p>Just hoping they come back around on CockroachDB-- I feel like it's a match made in heaven for what they're providing.