TechEcho
A decade of major cache incidents at Twitter

138 points by Smerity over 3 years ago

8 comments

jsty over 3 years ago
Major incidents aside, I always think that cache-related bugs are some of the most likely to go undetected, since if you don't test for them end-to-end, they're really not that easy to spot and diagnose.

An article sticking around too long on the home page. Semi-stale data creeping into your pipeline. Someone's security token being accepted post-revocation. All really hard to spot unless (1) you're explicitly looking, or (2) manure hits the fan.
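A minimal sketch of the revoked-token case, to show why it only surfaces when tested end-to-end. Everything here is hypothetical and for illustration only: `TTLCache`, `TokenService`, and the 60-second TTL are assumptions, not anything from the article.

```python
import time


class TTLCache:
    """Tiny in-memory cache with per-entry expiry (hypothetical, for illustration)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._entries[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (value, self.clock() + self.ttl)

    def invalidate(self, key):
        self._entries.pop(key, None)


class TokenService:
    """Validates tokens against a source of truth, with a cache in front."""

    def __init__(self, cache):
        self.valid_tokens = set()  # stands in for the real token store
        self.cache = cache

    def issue(self, token):
        self.valid_tokens.add(token)

    def revoke(self, token, invalidate_cache):
        self.valid_tokens.discard(token)
        if invalidate_cache:
            self.cache.invalidate(token)

    def is_valid(self, token):
        cached = self.cache.get(token)
        if cached is not None:
            return cached  # may be stale if revoke() skipped invalidation
        result = token in self.valid_tokens
        self.cache.set(token, result)
        return result


def accepted_after_revocation(invalidate_cache):
    """End-to-end check: is a token still accepted one second after revocation?"""
    now = [0.0]  # fake clock so the check does not need to sleep
    service = TokenService(TTLCache(ttl_seconds=60, clock=lambda: now[0]))
    service.issue("tok")
    service.is_valid("tok")  # warms the cache with a positive answer
    service.revoke("tok", invalidate_cache)
    now[0] += 1.0
    return service.is_valid("tok")
```

In this sketch the token store is correct in both cases; only a check that goes through the cached path notices that skipping invalidation keeps the revoked token alive until the TTL runs out.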
terom over 3 years ago
Required reading for all of the "I could code up Twitter in a weekend" types.

The long listen queue -> multiple queued-up retries feedback loop is a classic: https://datatracker.ietf.org/doc/html/rfc896 TCP/IP "congestion collapse" and the 1986 Internet meltdown [various sources]
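One common client-side way to dampen that kind of retry feedback loop is capped exponential backoff with jitter and a hard limit on attempts. The sketch below is an assumption about how a client might do this, not something taken from the article or from RFC 896; `backoff_delay`, `call_with_retries`, and the default values are hypothetical.

```python
import random
import time


def backoff_delay(attempt, base=0.1, cap=5.0, rng=random):
    """Full-jitter delay: uniform in [0, min(cap, base * 2**attempt)] seconds."""
    return rng.uniform(0.0, min(cap, base * (2 ** attempt)))


def call_with_retries(operation, max_attempts=4, sleep=time.sleep, rng=random):
    """Call operation(); on failure, wait a jittered delay and try again.

    Gives up (re-raising the last error) after max_attempts total calls, so a
    struggling server sees a bounded number of requests per caller.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error instead of queueing more work
            sleep(backoff_delay(attempt, rng=rng))
```

The jitter spreads retries out in time so that many clients failing together do not all retry together, and the attempt cap keeps each caller from adding unbounded load to an already long queue.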
Smerity over 3 years ago
What I find most interesting in this is the pseudo detective story of hunting down disappearing post-mortem and "lessons learned" documentation. Optimistically we'd hope that perhaps the older systems no longer reflect the existing systems in any meaningful way (possibly as the org structures and/or software stacks shift and change) and they're no longer relevant.

I'd imagine most lost knowledge is not an explicit decision, however, which means such historical scenarios / documentation / ... are just lost as part of business. Lost knowledge is the default for companies.

Twitter is likely better than most given their documentation is all digital and there exist explicit processes to catalogue such incidents. I'd also be curious to see how much of this knowledge has been implicitly exported to their open source codebases.
plasma over 3 years ago
I remember reading that Facebook's caches had a dedicated standby set of "gutter" servers (otherwise inactive and unused) that would quickly take over after a failure. That was an interesting mitigation for some failure scenarios.
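A rough sketch of the idea as this comment describes it: a client that falls back to a standby cache server when the primary is unreachable, instead of sending every request to the backing store. All names (`CacheServer`, `GutterClient`, `ServerDown`, `load_from_origin`) are hypothetical, and this is not Facebook's actual implementation.

```python
class ServerDown(Exception):
    """Raised when a cache server cannot be reached (hypothetical)."""


class CacheServer:
    """In-memory stand-in for a remote cache server."""

    def __init__(self):
        self.data = {}
        self.up = True

    def get(self, key):
        if not self.up:
            raise ServerDown()
        return self.data.get(key)

    def set(self, key, value):
        if not self.up:
            raise ServerDown()
        self.data[key] = value


class GutterClient:
    """Reads through the primary cache, falling back to a standby "gutter" server."""

    def __init__(self, primary, gutter, load_from_origin):
        self.primary = primary
        self.gutter = gutter
        self.load = load_from_origin  # callable that fetches from the backing store

    def get(self, key):
        for server in (self.primary, self.gutter):
            try:
                value = server.get(key)
            except ServerDown:
                continue  # this server is unreachable: try the next one
            if value is None:
                # Cache miss on a healthy server: fill it from the origin.
                value = self.load(key)
                server.set(key, value)
            return value
        # Both servers are down: serve straight from the origin.
        return self.load(key)
```

With this shape, losing the primary costs one origin read per key to warm the standby, rather than an origin read on every request for as long as the primary stays down.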
Jach over 3 years ago
These big incidents involving 'big cache' are fun to read about. Years ago I had to deal with a bunch of cache issues over a short time, but they were all minor incidents with minor uses of cache (simple memoization, storing stuff in maps on attributes of Java singletons, browser local storage). Still, I made a checklist of questions to ask thenceforth on any proposal or implementation of a cache in a doc or code review. A bunch of them are just focused on actually paying attention to what your keys are made of and how invalidation works (or if you even can invalidate, or if it's even needed). I think for 'big cache' questions I should just refer to this blog post and ask "what's the risk of these issues?"
wizwit999 over 3 years ago
Yeah, see also: Marc Brooker has a good article on why the bimodal behavior of caches can cause a lot of headaches: https://brooker.co.za/blog/2021/08/27/caches.html
mprovost over 3 years ago
"There are only two hard things in Computer Science: cache invalidation and naming things." -- Phil Karlton

https://martinfowler.com/bliki/TwoHardThings.html
spoonjim over 3 years ago
“On Nov 8, a user changed their name from tigertwo to Woflstar_Bachi.”

Horrifically inappropriate inclusion of PII in this post. Didn’t someone at legal go through this?