Centrifuge: a reliable system for delivering billions of events per day

156 points by bretthoerner, almost 7 years ago

13 comments

mnutt, almost 7 years ago
I'm currently investigating a very similar problem (very high throughput webhooks, specified by customers, to unreliable endpoints) and was considering an architecture involving a set of queues partitioned by webhook response time and/or failure rate.

So if your webhook is bucketed into the 0-100ms queue and your responses start to exceed 100ms, you'd be bumped up to the 100-500ms queue, which is more likely to have periodic queuing delays, and upwards from there depending on your response time / failure rate. If your API later recovered and started responding faster, you'd be moved back into the faster queue. That way we could offer different (soft) SLAs for different classes of response times, and scale the workers independently per queue.

I'm curious if there are known issues that people have run into with this approach? The main unknown was how many workers we'd be willing to throw at consistently slow endpoints to try to keep the slower queues from backing up too much, and possibly some flapping, as endpoints could respond quickly under low throughput but slow down as soon as they move to the faster queues.
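A minimal sketch of the bucketing idea described above, not anyone's production design. It assumes a smoothed (EWMA) response time per endpoint and a hysteresis margin to damp the flapping the comment worries about. The 100ms and 500ms bounds come from the comment; the class name `EndpointStats`, the smoothing factor, the promotion margin, and the unbounded third tier are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Upper latency bound (ms) of each queue tier. 100 and 500 are from the
# comment above; the open-ended last tier is an assumption.
TIER_BOUNDS_MS = [100.0, 500.0, float("inf")]


@dataclass
class EndpointStats:
    """Tracks a smoothed response time and the endpoint's current queue tier."""

    ewma_ms: Optional[float] = None
    tier: int = 0
    alpha: float = 0.2           # EWMA smoothing factor (assumed)
    promote_margin: float = 0.8  # must be 20% under a bound to move to a faster tier (assumed)

    def observe(self, latency_ms: float) -> int:
        """Record one response time and return the (possibly changed) tier."""
        if self.ewma_ms is None:
            self.ewma_ms = latency_ms
        else:
            self.ewma_ms = self.alpha * latency_ms + (1 - self.alpha) * self.ewma_ms

        # Move to a slower tier as soon as the smoothed latency exceeds the bound.
        while self.ewma_ms > TIER_BOUNDS_MS[self.tier]:
            self.tier += 1
        # Move to a faster tier only when comfortably below that tier's bound,
        # so an endpoint hovering near a boundary does not bounce back and forth.
        while self.tier > 0 and self.ewma_ms < TIER_BOUNDS_MS[self.tier - 1] * self.promote_margin:
            self.tier -= 1
        return self.tier
```

With these assumed parameters, an endpoint that settles at around 90ms after a slow spell stays in the 100-500ms tier, and only returns to the fastest tier once its smoothed latency drops below 80ms. Failure rate, which the comment also mentions, is left out of this sketch.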
ryanwaggoner, almost 7 years ago
Ugh. This is only tangentially related, but Segment has always looked great, but is so incredibly, ridiculously overpriced for web analytics.

If you want to slap it on your website for basic analytics, you better not get any traffic to speak of, because they charge you about $.01 per monthly tracked user, anonymous or identified.

If you get 100k visitors per month (not even THAT much), it's $1,125/mo. Their pricing estimator stops at $2,375 / month for 225k monthly tracked users.

God forbid you put it on your site or in your mobile app and then anything you do goes viral. It'll bankrupt you.

Who on earth is this aimed at? Is this only suitable for tracking logged in users? Or only suitable for enterprise companies?

Better yet, what about what they're doing is so difficult and expensive, when many of the places you'd be sending these analytics (which will be handling the same number of events) are free or cheap at this level of use?

Segment has always seemed cool, but overpriced for web analytics by a factor of 20-50x.

OK, rant over.
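As a quick check of the arithmetic, the two price points quoted in the comment can be turned into per-user figures. This uses only the numbers given above; it says nothing about Segment's actual pricing formula.

```python
# Figures quoted in the comment above: monthly tracked users -> USD per month.
quoted = {100_000: 1_125, 225_000: 2_375}

# Average cost per tracked user at each quoted point.
for users, price in quoted.items():
    print(f"{users:>7,} MTU: ${price / users:.5f} per user")

# Marginal cost per additional user between the two quoted points.
(u1, p1), (u2, p2) = sorted(quoted.items())
marginal = (p2 - p1) / (u2 - u1)
print(f"marginal: ${marginal:.4f} per additional user")
```

Both points work out to slightly over a cent per user, and the step between them to one cent per additional user, which matches the "about $.01" figure in the comment.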
manigandham, almost 7 years ago
>> To implement per source-destination queues with full isolation, we’d need hundreds of thousands of different queues. Across Kafka, RabbitMQ, NSQ, or Kinesis–we haven’t seen any queues which support that level of cardinality with simple scaling primitives.

I've been posting this a lot recently, but it keeps coming up as the relevant solution: Apache Pulsar supports millions of topics without much overhead and offers the log semantics of Kafka with better scaling and per-message acknowledgement: https://pulsar.incubator.apache.org
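To illustrate what "per source-destination queues with full isolation" means, here is a toy in-memory sketch. It is not Pulsar's, Kafka's, or Centrifuge's API; the class `IsolatedQueues` and its methods are hypothetical names. Each (source, destination) pair gets its own FIFO, and a drain pass takes at most one event per pair.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Iterator, Tuple

Key = Tuple[str, str]  # (source, destination); illustrative names only


class IsolatedQueues:
    """One FIFO per (source, destination) pair, drained round-robin."""

    def __init__(self) -> None:
        self._queues: Dict[Key, Deque[str]] = defaultdict(deque)

    def publish(self, source: str, destination: str, event: str) -> None:
        """Append an event to the queue owned by this pair."""
        self._queues[(source, destination)].append(event)

    def drain_round(self) -> Iterator[Tuple[Key, str]]:
        """Yield at most one event per pair, so a deep backlog for one
        pair cannot delay delivery for the others."""
        for key in list(self._queues):
            queue = self._queues[key]
            if queue:
                yield key, queue.popleft()
            if not queue:
                # Drop empty queues so the key space does not grow without bound.
                del self._queues[key]
```

A short usage example: three events backed up for one pair do not hold up the single event for another.

```python
q = IsolatedQueues()
for i in range(3):
    q.publish("src-a", "slow-dest", f"a{i}")
q.publish("src-b", "fast-dest", "b0")
first_round = list(q.drain_round())
```

The hard part the quoted passage points at is doing this durably across hundreds of thousands of pairs, which an in-process dictionary does not address.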
calvinfo, almost 7 years ago
Hey HN, author of the post here. A number of Segment engineers are hanging around today, and we are happy to answer questions in the comments. Thanks in advance for any feedback and thoughtful discussion!
jbs40, almost 7 years ago
I wonder if this would have been built on Apache Pulsar (https://pulsar.apache.org) if it had been open source and on the Segment team's radar at the time they started on Centrifuge. I work with one of the architects of Pulsar, and his first thought on seeing the blog was that Segment's scenario had a lot of similarities to what he and the team at Yahoo set out to do when they first built Pulsar there several years ago.
whalesalad, almost 7 years ago
My spidey sense is telling me that this could have been achieved with a few Erlang VMs and a lot fewer moving parts.

It's a supervisor (Director) with a bunch of actors communicating with different 3rd party APIs. The state mechanism could ostensibly get abstracted away so the underlying DB is irrelevant.
temuze, almost 7 years ago
Great post!

> To keep the ‘small working set’ even smaller, we cycle these JobDBs roughly every 30 minutes. The manager cycles JobDBs when their target of filled percentage data is about to exceed available RAM.

I'm confused - why does JobDB's memory trickle up over time? Isn't it a database? Are you using MySQL's memory storage engine or something?
Zaheer, almost 7 years ago
Awesome write-up! Curious about the decision to choose MySQL, since it's a write-heavy load with minimal querying. Would something like Redis be better suited? I'm not super familiar with Redis, so just curious about what other DBs were considered.
sjeanpierre, almost 7 years ago
Hi, thanks for the great write up. Can you provide some more details about how you handle the RDS side of the rotation process and maintaining the spares?
siscia, almost 7 years ago
Thanks for the post! It was a really good read.

In a similar position I would have tried MQTT, which should handle a great number of topics quite well.

Did you guys try such a protocol?
tekmaven, almost 7 years ago
By age 35 you should have written your own queuing system at least once.
scrollaway, almost 7 years ago
Ok, I've read through most of this (really cool post btw!) and I still can't figure out if this project is using the Centrifuge stack or not:

https://github.com/centrifugal

It looks like not, but that's a hell of a naming overlap.

I looked at Centrifugo a few years ago to deliver live Hearthstone games (in game replay format) through the web. It's a pretty sweet project.
carapace, almost 7 years ago
What is with the rash of bad naming decisions these days?

It started with "Cucumber" or "Celery" or something a few years ago, didn't it?

I'll skip ranting about how "Go" was a terrible name for a PL (and only mention parenthetically how they gaffled that name from a different PL!) They have "Grumpy", and something called "Thanos" (good luck searching for that until the hype for that comic book movie dies down.) I feel like I've seen several other projects recently that have been named after other things.