How Discord stores billions of messages (2017)

204 points by greymalik, over 2 years ago

12 comments

jpalomaki, over 2 years ago
This is quite valuable advice: "The original version of Discord was built in just under two months in early 2015. Arguably, one of the best databases for iterating quickly is MongoDB. Everything on Discord was stored in a single MongoDB replica set and this was intentional, but we also planned everything for easy migration to a new database"

The article also links to a Twitter blog post from 2010 that makes a similar point: "We [Twitter] currently use MySQL to store most of our online data. In the beginning, the data was in one small database instance which in turn became one large database instance and eventually many large database clusters" [1]

[1] https://blog.twitter.com/engineering/en_us/a/2010/announcing-snowflake
chrsig, over 2 years ago
> Nothing was surprising, we got exactly what we expected.

Such a satisfying feeling in the engineering world.

> We noticed Cassandra was running 10 second “stop-the-world” GC constantly but we had no idea why.

This makes me very thankful for the work that the Go team has put into the Go GC.

> In the scenario that a user edits a message at the same time as another user deletes the same message, we ended up with a row that was missing all the data except the primary key and the text since all Cassandra writes are upserts.

Does Cassandra not offer a mechanism to do a conditional update? I'd expect to be able to submit an upsert that fails if the row isn't present, or has a `deleted = true` field, or something to that effect.
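The edit/delete race quoted above can be sketched with a toy in-memory table. This is only an illustration under stated assumptions: `ToyUpsertTable`, `update_if_exists`, and the column names (`author_id`, `channel_id`, `content`) are hypothetical and are not Cassandra's API or Discord's schema. The model just captures "every write is an upsert" and contrasts it with a write that is applied only when the row already exists.

```python
from typing import Dict, Optional, Tuple

Row = Dict[str, object]


class ToyUpsertTable:
    """Toy table where every write is an upsert (hypothetical, not Cassandra's API)."""

    def __init__(self) -> None:
        # primary key (message_id) -> column values
        self._rows: Dict[int, Row] = {}

    def upsert(self, message_id: int, **columns: object) -> None:
        # Creates the row if it is missing, otherwise overwrites only the given columns.
        self._rows.setdefault(message_id, {}).update(columns)

    def delete(self, message_id: int) -> None:
        self._rows.pop(message_id, None)

    def update_if_exists(self, message_id: int, **columns: object) -> bool:
        # Conditional write: applied only when the row is already present.
        if message_id not in self._rows:
            return False
        self._rows[message_id].update(columns)
        return True

    def get(self, message_id: int) -> Optional[Row]:
        columns = self._rows.get(message_id)
        if columns is None:
            return None
        return {"message_id": message_id, **columns}


def race_with_plain_upsert() -> Optional[Row]:
    table = ToyUpsertTable()
    table.upsert(1, author_id=42, channel_id=7, content="hello")
    table.delete(1)                    # one user deletes the message
    table.upsert(1, content="hello!")  # another user's edit lands afterwards
    return table.get(1)                # row resurrected with only key and text


def race_with_conditional_update() -> Tuple[bool, Optional[Row]]:
    table = ToyUpsertTable()
    table.upsert(1, author_id=42, channel_id=7, content="hello")
    table.delete(1)
    applied = table.update_if_exists(1, content="hello!")
    return applied, table.get(1)       # edit rejected, row stays deleted
```

With the plain upsert, the late edit brings back a row holding only `message_id` and `content`; with the conditional variant, the edit is rejected. For what it's worth, CQL does have conditional writes of this flavor (lightweight transactions such as `UPDATE ... IF EXISTS`), though they need extra coordination between replicas, so they are not free.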
judge2020, over 2 years ago
(2017)

As revealed in this blog post [0], they now see 4 billion messages per day.

[0] https://news.ycombinator.com/item?id=32474093
raffraffraff, over 2 years ago
I haven't used Cassandra for about 3 years, and this is awakening memories. At a previous company I inherited a very badly assembled cluster that was used for a time series database. The guys who built it said (and no, I'm not kidding...) "we don't need to put a TTL on metrics because they're tiny and anyway we can just add more nodes and scale the cluster horizontally forever!". Well, forever was about 2 years, when the physical data center ran out of rack space, and teams abused the metrics system with TBs of bullshit data. That was when they handed the whole metrics system to yours truly. And I discovered two things:

1. You can't just bulk delete a year of old stale data without breaking it

2. "Whoops, did we really set replication factor to 1?"

Fun.
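As a rough illustration of what a per-write TTL buys you in the metrics story above, here is a toy store where each write may carry an expiry. The class and method names (`ToyTtlStore`, `purge_expired`) are hypothetical and this is not how Cassandra implements it; CQL exposes the idea through `USING TTL` on writes, and the sketch only shows the observable behavior: expired values stop being readable and can be reclaimed without an explicit bulk delete.

```python
from typing import Dict, Optional, Tuple


class ToyTtlStore:
    """Toy metric store with an optional per-write TTL (illustrative only)."""

    def __init__(self) -> None:
        # key -> (value, expiry timestamp in seconds, or None for "never expires")
        self._data: Dict[str, Tuple[float, Optional[float]]] = {}

    def write(self, key: str, value: float, now: float,
              ttl: Optional[float] = None) -> None:
        expires_at = None if ttl is None else now + ttl
        self._data[key] = (value, expires_at)

    def read(self, key: str, now: float) -> Optional[float]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and now >= expires_at:
            return None  # expired values are invisible to readers
        return value

    def purge_expired(self, now: float) -> int:
        # Reclaim space for expired entries; returns how many were dropped.
        expired = [k for k, (_, exp) in self._data.items()
                   if exp is not None and now >= exp]
        for key in expired:
            del self._data[key]
        return len(expired)

    def size(self) -> int:
        return len(self._data)


store = ToyTtlStore()
store.write("cpu", 0.5, now=0, ttl=60)  # expires 60 seconds after the write
store.write("mem", 0.7, now=0)          # no TTL: kept until someone deletes it
```

In this toy, the `mem` entry is the "we don't need a TTL" case: nothing ever removes it, so the store only grows unless someone deletes data by hand.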
oreally, over 2 years ago
I don't know databases and there are quite a number of them on the market, so posts like these are great.

Sidetrack though: does anyone have a list of the pros and cons of each DB, with a preference towards low latency? Also, how do they compare with, say, Postgres?
coldblues, over 2 years ago
Unfortunately, they're not doing a good job of deleting them. If you press the "Delete Account" button, all it does is anonymize your profile and leave all of your messages intact. One of the reasons I avoid using Discord whenever possible.
paxys, over 2 years ago
As a point of comparison, Slack uses MySQL (Vitess): https://slack.engineering/scaling-datastores-at-slack-with-vitess/
Jamie9912, over 2 years ago
I believe they use ScyllaDB exclusively now for storing messages
mannyv, over 2 years ago
"build quickly to prove out a product feature, but always with a path to a more robust solution"

Yes
unlog, over 2 years ago
Discord doesn't respect privacy: you cannot just get rid of a whole conversation. Users are the product, and they make it so difficult to delete entire convos that it's obvious the data is valuable to them.
hestefisk, over 2 years ago
Wonder if Postgres would scale to such volume.
seydor, over 2 years ago
How many are bots? AFAIK bot traffic is higher than human traffic there. Plus, they added so many bot APIs that bots are now hacking and spamming users' accounts.