TechEcho

Hunting down the stuck BGP routes

222 points by bswinnerton about 4 years ago

8 comments

teh_klev about 4 years ago
From the article:

> With the current “default free zone” containing around 1,000,000 routes

Back in ~1998 I was tasked with building a route collector/looking glass machine for an internet exchange point (sadly defunct). I remember the day we switched the collector on and acquired "all the routes": there were ~98,000 of them, and you could've knocked me over with a feather. It was like looking into the Total Perspective Vortex. Having been out of that game for many years now, I'd no idea we were up to 1M routes... wow. At one of the RIPE conferences I attended back then, there was much concern about the rapidly increasing size of the global routing table and whether vendors could build hardware powerful enough to keep up.

For anyone interested, the route collector was built on FreeBSD (3.0 I think) and Zebra[0].

And finally, what a cracking blog, especially stuff like this:

https://blog.benjojo.co.uk/post/eve-online-bgp-internet

[0]: https://en.wikipedia.org/wiki/GNU_Zebra
navanchauhan about 4 years ago
This reminds me of when YouTube was down for a lot of the world after Pakistan banned YouTube and one of the country's telecom companies forgot to switch off their BGP route (if that is the correct terminology).[0] Half as Interesting made a nice YouTube video about it.[1]

[0] https://www.cnet.com/news/how-pakistan-knocked-youtube-offline-and-how-to-make-sure-it-never-happens-again/

[1] https://www.youtube.com/watch?v=K9gnRs33NOk
anticristi about 4 years ago
Thinking out loud: When I read the BGP spec, I got the feeling that it was optimized for reduced churn. As the Internet routing table size increased and the increase in CPU power of routers was an uncertainty, the architects of the Internet wanted to avoid extra BGP exchanges.

However, now it seems like the Internet is facing new challenges and a different trade-off might make sense. Why not add a "valid until" attribute on each route? The originating router would have to re-announce a new route every 24 hours. Failure to propagate the update at any point would automatically withdraw it. Of course, re-announcing 1M routes every day might be a lot, but at this point it feels worth considering.
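
The "valid until" idea in the comment above can be sketched as a routing table that drops any route whose lifetime has lapsed without a fresh announcement. This is only a toy model under stated assumptions: `valid_until` is a hypothetical attribute that does not exist in BGP today, and `Route`, `RoutingTable`, and the 24-hour default are names and values invented for illustration.

```python
from dataclasses import dataclass

DAY_SECONDS = 24 * 3600.0  # assumed lifetime, taken from the "every 24 hours" idea


@dataclass
class Route:
    prefix: str
    valid_until: float  # hypothetical attribute, not part of BGP today


class RoutingTable:
    """Toy table where routes expire unless they are re-announced."""

    def __init__(self):
        self._routes = {}

    def announce(self, prefix, now, ttl=DAY_SECONDS):
        # A re-announcement simply overwrites the entry and extends its lifetime.
        self._routes[prefix] = Route(prefix, now + ttl)

    def withdraw(self, prefix):
        # Explicit withdrawal still works; expiry is only the safety net.
        self._routes.pop(prefix, None)

    def expire(self, now):
        # Remove every route whose lifetime has lapsed and report what was dropped.
        stale = [p for p, r in self._routes.items() if r.valid_until <= now]
        for p in stale:
            del self._routes[p]
        return sorted(stale)

    def prefixes(self):
        return sorted(self._routes)


if __name__ == "__main__":
    table = RoutingTable()
    table.announce("192.0.2.0/24", now=0.0)
    table.announce("198.51.100.0/24", now=0.0)
    # Only the first prefix is refreshed before its lifetime runs out.
    table.announce("192.0.2.0/24", now=80000.0)
    print(table.expire(now=DAY_SECONDS))
    print(table.prefixes())
```

In this model a lost withdrawal costs at most one lifetime of staleness, which is the trade-off the comment describes: more periodic announcements in exchange for self-healing tables.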
sigmonsays about 4 years ago
Interesting read. It's interesting that this is a bug in the specification and not in an implementation, since BGP is so old. We must not hit this case often.
vlovich123 about 4 years ago
I wonder if a robust consensus algorithm might be a better investment than a timeout. I would imagine there are other bugs in BGP implementations, so having a routing table that's going to trend towards eventual consistency regardless of the starting point might be a more robust solution than just focusing on this one corner case. It might be a more intrusive change though, and it could be hard to get middleware to roll out such a change?
john37386 about 4 years ago
Nice article on the basic functionality of the Internet backbone. I really like the animations that illustrate the article. In short, BGP has a bug that potentially created a huge outage in August 2020. The proposed fix is to improve the BGP protocol with a new feature. It's not easy, because it's the backbone of the internet. Let's see where this will go.
michaelbuckbee about 4 years ago
So I keep coming into situations where I think this is the problem that's occurring (a stuck route). While I'd certainly love to be able to diagnose this, would it even matter? There's no recourse that I can take as an end user, is there?
pas about 4 years ago
... hm, how come withdraw (and announce) messages are not ACKed in-band? or maybe they are, but due to explicit demonic of certain routers (and/or ASes) they still don't take effect?
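
One way to think about the question above is a toy chain of routers in which every hop acknowledges what it receives, yet one router never passes a withdraw along. This is a simplified sketch, not how any real BGP implementation works: `Router`, `build_chain`, and the `drops_withdraws` flag are hypothetical names, and the "ack" is just a return value standing in for per-hop delivery (BGP sessions run over TCP, so delivery to the direct neighbor is already reliable).

```python
class Router:
    """Toy router that forwards updates to a single downstream neighbor."""

    def __init__(self, name, drops_withdraws=False):
        self.name = name
        self.drops_withdraws = drops_withdraws  # hypothetical bug flag
        self.routes = set()
        self.downstream = None

    def receive(self, kind, prefix):
        """Apply an update, forward it downstream, and return a per-hop ack."""
        if kind == "announce":
            self.routes.add(prefix)
        elif kind == "withdraw":
            self.routes.discard(prefix)
            if self.drops_withdraws:
                # Acked to the sender, but never propagated any further.
                return True
        if self.downstream is not None:
            self.downstream.receive(kind, prefix)
        return True


def build_chain(*routers):
    # Link each router to the next one and return the head of the chain.
    for upstream, downstream in zip(routers, routers[1:]):
        upstream.downstream = downstream
    return routers[0]


if __name__ == "__main__":
    a, b, c = Router("A"), Router("B", drops_withdraws=True), Router("C")
    head = build_chain(a, b, c)
    head.receive("announce", "203.0.113.0/24")
    acked = head.receive("withdraw", "203.0.113.0/24")
    for r in (a, b, c):
        print(r.name, sorted(r.routes))
    print("origin saw ack:", acked)
```

In this model the originator gets its ack from the first hop, while the router behind the faulty one keeps the route. A per-hop acknowledgement therefore says nothing about what happened two or more hops away.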