Tiered storage won't fix Kafka

65 points by itunpredictable, about 1 year ago

7 comments

mschuster91, about 1 year ago
> But if you could rebuild streaming from the ground up for the cloud, you could achieve something a lot better than fewer disks – zero disks. The difference between some disks and zero disks is night and day. Zero disks, with everything running directly through object storage with no intermediary disks, would be better.

That's still a trade-off. Object storage, simply by the overhead of HTTP + SSL, has higher latency than EFS, which has higher latency than EBS, which has higher latency than local SSD. So in the end your service (no matter if it's Kafka or anything else) has _higher_ latency if you also want consistency (aka resilience against "everything goes dark in an instant"), as all writes on all machines in the pool have to be committed to storage.

The only way a "zero disk" *anything* makes sense is if you have enough machines in enough diverse locations with enough RAM to cover the entire workload, and you pray there's never any event taking the entire cloud provider offline.
knur, about 1 year ago
I agree, but I hate it when I realize at the end of an article that it was just an ad.

The conflict of interest should be disclosed in the very first sentence of the post.
temporarely, about 1 year ago
> taken to its logical conclusion, tiered storage could turn Kafka into [...]

A message broker sitting in front of an RDBMS. I mean, if we're now basically 'tailing' streaming data and saving it to another storage system, we might as well use RabbitMQ.
kdavyd, about 1 year ago
The article doesn't mention which EBS volume type was used, but since Provisioned IOPS are mentioned, I assume it's gp3 or io2. One pattern that is especially common in time series databases, but could work for Kafka too, is not tiering down to S3 but changing older volumes to a slower volume type, such as sc1 ($0.015/GiB-Mo). This can be done completely transparently to the application.

Another thing worth looking into is Mountpoint for Amazon S3, with or without read caching, which offers a POSIX-like interface to S3 for applications that don't natively support S3.
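As a rough illustration of the volume-type pattern described in this comment, the sketch below estimates monthly storage cost when some fraction of the data sits on the slower volume type. Only the sc1 price ($0.015/GiB-Mo) comes from the comment; the price for the faster volume type, the function name `monthly_cost`, and the 10,000 GiB example size are assumptions chosen for illustration, and the sketch ignores IOPS, throughput, and minimum-volume-size charges.

```python
# Price quoted in the comment above for the slower volume type (sc1).
SC1_PRICE = 0.015  # USD per GiB-month
# ASSUMED price for the faster volume type; not stated in the thread.
HOT_PRICE = 0.08   # USD per GiB-month (hypothetical)


def monthly_cost(total_gib, cold_fraction,
                 hot_price=HOT_PRICE, cold_price=SC1_PRICE):
    """Estimate monthly storage cost with cold_fraction of data on the slower type."""
    if not 0.0 <= cold_fraction <= 1.0:
        raise ValueError("cold_fraction must be between 0 and 1")
    cold_gib = total_gib * cold_fraction
    hot_gib = total_gib - cold_gib
    return hot_gib * hot_price + cold_gib * cold_price


if __name__ == "__main__":
    # Hypothetical 10,000 GiB of retained log data.
    for frac in (0.0, 0.5, 0.9):
        print(f"{frac:.0%} cold: ${monthly_cost(10_000, frac):,.2f}/month")
```

Under these assumed prices the saving grows linearly with the share of data moved to the slower type; whether that is worthwhile depends on how often older segments are actually read.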
jackbauer24, about 1 year ago
Have you seen AutoMQ's approach (https://github.com/AutoMQ/automq)? It is hard to believe that users can tolerate a produce latency of hundreds of milliseconds with WarpStream. As the co-founder and CEO of AutoMQ, I have engaged with hundreds of users, many of whom are seeking both speed and reliability. So we need a stateless broker solution that is fully compatible with Apache Kafka, while also excelling in low latency and cost-effectiveness on cloud infrastructure.
Foobar8568, about 1 year ago
And everything is an abstract layer of CSV over SFTP /soff
msarrel, about 1 year ago
Tiered storage won't save Kafka, but we built the exact same thing while thinking differently, so we're better. Got it!