Lichess: Post-Mortem of Our Longest Downtime

176 points by jpablo 8 months ago

5 comments

carlsborg 8 months ago
The main lichess engine (lila, open source) is a single monolith program that's deployed on a single server. It serves ~5 million games per day. But there are several other pieces too. They discuss the architecture here: https://www.youtube.com/watch?v=crKNBSpO2_I

BTW consider donating if you use lichess.
theideaofcoffee 8 months ago
I guess some of my questions are addressed in the latter half of the post, but I'm still puzzled why a prominent service didn't have a plan for what looked like a run-of-the-mill hardware outage. It's hard to know exactly what happened as I'm having trouble parsing some of the post (what is a 'network connector'? Is it a cable? A NIC?). What were some of the 'increasingly outlandish' workarounds? Are they actually standing up production hosts manually, and was that the cause of a delay or unwillingness to get new hardware going? I think it would be important to have all of that set down either in documentation or code, seeing as most of their technical staff are either volunteers, who may come and go, or part-timers. Maybe they did, it's not clear.

It's also weird seeing that they are still waiting on their provider to tell them exactly what was done to the hardware to get it going again. That's usually one of the first things a tech mentions: "ok, we replaced the optics in port 1" or "I replaced that cable after seeing increased error rates", something like that.
holsta 8 months ago
This response and post-mortem is superior to most commercial services I have seen in recent years.
ctippett 8 months ago
Once the private link was reestablished, could they not have tunneled out to the internet via another server acting as a sort of gateway?

Disclaimer: I'm not a network engineer so I may be misunderstanding the practicality and complexity of such a workaround.
lazyant 8 months ago
summary for the lazy: OVH