TechEcho

7 comments

mschuster91 about 1 year ago

> But if you could rebuild streaming from the ground up for the cloud, you could achieve something a lot better than fewer disks – zero disks. The difference between some disks and zero disks is night and day. Zero disks, with everything running directly through object storage with no intermediary disks, would be better.

That's still a trade-off. Object storage, simply because of the overhead of HTTP + SSL, has higher latency than EFS, which has higher latency than EBS, which has higher latency than local SSD. So in the end your service (whether it's Kafka or anything else) has _higher_ latency if you also want consistency (aka resilience against "everything goes dark in an instant"), because all writes on all machines in the pool have to be committed to storage.

The only way a "zero disk" anything makes sense is if you have enough machines in enough diverse locations with enough RAM to cover the entire workload, and you pray there's never an event that takes the entire cloud provider offline.

knur about 1 year ago

I agree, but I hate it when I realize at the end of an article that it was just an ad. The conflict of interest should be disclosed in the very first sentence of the post.

temporarely about 1 year ago

> taken to its logical conclusion, tiered storage could turn Kafka into [...]

A message broker sitting in front of an RDBMS. I mean, if we're now basically 'tailing' streaming data and saving it to another storage system, we might as well use RabbitMQ.

kdavyd about 1 year ago

The article doesn't mention which EBS volume type was used, but since Provisioned IOPS are mentioned, I assume it's gp3 or io2. One pattern that is especially common in time-series databases, and could work for Kafka too, is not to tier down to S3 but to change older volumes to a slower volume type, such as sc1 ($0.015/GiB-Mo). This can be done completely transparently to the application.

Another thing worth looking into is S3 Mountpoint, with or without read caching, which offers a POSIX-like interface to S3 for applications that don't natively support it.
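
For anyone who wants to try the volume-type change, here is a minimal sketch, assuming boto3 and the EC2 `modify_volume` call. The `last_write` field, the 30-day cutoff, and both helper names are hypothetical, made up for illustration. Nothing here is specific to Kafka or taken from the article. The planning step is kept separate from the AWS call so it can be checked without credentials.

```python
from datetime import datetime, timedelta, timezone

COLD_TYPE = "sc1"        # cold HDD, the volume type named in the comment
MIN_SC1_SIZE_GIB = 125   # sc1 volumes cannot be smaller than 125 GiB


def plan_downgrades(volumes, now, max_age=timedelta(days=30)):
    """Build modify_volume keyword arguments for volumes older than max_age.

    `volumes` is a list of dicts with VolumeId, VolumeType, Size (GiB) and a
    hypothetical `last_write` datetime that the caller has to supply.
    """
    requests = []
    for vol in volumes:
        if vol["VolumeType"] == COLD_TYPE:
            continue  # already on the cold tier
        if vol["Size"] < MIN_SC1_SIZE_GIB:
            continue  # too small to be converted to sc1
        if now - vol["last_write"] < max_age:
            continue  # written recently, keep it on the fast tier
        requests.append({"VolumeId": vol["VolumeId"], "VolumeType": COLD_TYPE})
    return requests


def apply_downgrades(requests):
    """Send the planned changes to EC2. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so planning works without boto3 installed

    ec2 = boto3.client("ec2")
    for req in requests:
        # The volume stays attached and usable while the change is applied.
        ec2.modify_volume(**req)
```

Usage would be along the lines of `apply_downgrades(plan_downgrades(volumes, datetime.now(timezone.utc)))`, with `volumes` built from whatever bookkeeping tracks when each volume was last written.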

jackbauer24 about 1 year ago

Have you seen AutoMQ's approach (https://github.com/AutoMQ/automq)? It is hard to believe that users can tolerate a produce latency of hundreds of milliseconds with WarpStream. As the co-founder and CEO of AutoMQ, I have engaged with hundreds of users, many of whom are seeking both speed and reliability. So we need a stateless broker solution that is fully compatible with Apache Kafka and also delivers low latency and cost-effectiveness on cloud infrastructure.

Foobar8568 about 1 year ago

And everything is an abstract layer of CSV over SFTP /soff

msarrel about 1 year ago

Tiered storage won't save Kafka but we built the exact same thing while thinking differently so we're better. Got it!

Tiered storage won't fix Kafka
