> But if you could rebuild streaming from the ground up for the cloud, you could achieve something a lot better than fewer disks – zero disks. The difference between some disks and zero disks is night and day. Zero disks, with everything running directly through object storage with no intermediary disks, would be better.<p>That's still a trade-off. Object storage, simply by the overhead of HTTP + TLS, has higher latency than EFS, which has higher latency than EBS, which has higher latency than local SSD. So in the end your service (whether it's Kafka or anything else) has _higher_ latency if you also want consistency (aka resilience against "everything goes dark in an instant"), as all writes on all machines in the pool have to be committed to storage.<p>The only way a "zero disk" <i>anything</i> makes sense is if you have enough machines in enough diverse locations with enough RAM to cover the entire workload, and you pray there's never any event taking the entire cloud provider offline.
I agree, but I hate it when at the end of an article I realize it was just an ad.<p>The conflict of interest should be disclosed in the very first sentence of the post.
> taken to its logical conclusion, tiered storage could turn Kafka into [...]<p>A message broker sitting in front of an RDBMS. I mean, if we're now basically 'tailing' streaming data and saving it to another storage system, we might as well use RabbitMQ.
The article doesn't mention which EBS volume type was used, but since Provisioned IOPS are mentioned, I assume it's gp3 or io2. One pattern that is especially common in time series databases, but could work for Kafka too, is not tiering down to S3, but changing older volumes to a slower volume type, such as sc1 ($0.015/GiB-Mo). This can be done completely transparently to the application.<p>Another thing worth looking into is Mountpoint for Amazon S3, with or without read caching, which offers a POSIX-like interface for S3 to applications that don't natively support S3.
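A rough sketch of the "demote older volumes to sc1" idea, in case it helps. Everything here is hypothetical: the `Volume` record, the `plan_retype` and `monthly_saving` helpers, and the 30-day cutoff are made up for illustration, and the gp3 price is an assumed list price rather than anything from the article. The sc1 price is the one quoted above. The sketch only builds a plan and estimates the saving; it does not call AWS.

```python
from dataclasses import dataclass

# Prices in USD per GiB-month. The sc1 figure is the one quoted in the comment;
# the gp3 figure is an ASSUMPTION (a typical list price), check your region.
GP3_PRICE = 0.08
SC1_PRICE = 0.015
SC1_MIN_GIB = 125  # sc1 volumes have a minimum size, smaller ones are skipped


@dataclass
class Volume:
    """Hypothetical record describing one EBS volume."""
    volume_id: str
    size_gib: int
    age_days: int
    volume_type: str = "gp3"


def plan_retype(volumes, min_age_days=30):
    """Return one dict of volume-modification arguments per volume to demote."""
    return [
        {"VolumeId": v.volume_id, "VolumeType": "sc1"}
        for v in volumes
        if v.volume_type == "gp3"
        and v.age_days >= min_age_days
        and v.size_gib >= SC1_MIN_GIB
    ]


def monthly_saving(volumes, plan):
    """Estimate the monthly storage saving (USD) if the plan is applied."""
    planned_ids = {p["VolumeId"] for p in plan}
    gib = sum(v.size_gib for v in volumes if v.volume_id in planned_ids)
    return gib * (GP3_PRICE - SC1_PRICE)


if __name__ == "__main__":
    vols = [
        Volume("vol-a", 500, 90),  # old and large enough: demote
        Volume("vol-b", 500, 3),   # too recent: keep on gp3
        Volume("vol-c", 100, 90),  # below the sc1 minimum size: keep
    ]
    plan = plan_retype(vols)
    print(plan)
    print(f"estimated saving: ${monthly_saving(vols, plan):.2f}/month")
```

Each dict in the plan is shaped so it could be passed as keyword arguments to boto3's EC2 `modify_volume` call, but that step is left out here so the sketch runs without credentials. Note that only storage price is compared; throughput and IOPS limits on sc1 are much lower, which is the actual trade-off.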
Have you seen AutoMQ's approach (<a href="https://github.com/AutoMQ/automq">https://github.com/AutoMQ/automq</a>)?
It is hard to believe that users can tolerate a produce message latency of hundreds of milliseconds with WarpStream. As the co-founder and CEO of AutoMQ, I have engaged with hundreds of users, many of whom are seeking both speed and reliability. So we require a stateless broker solution that is fully compatible with Apache Kafka, while also excelling in terms of low latency and cost-effectiveness on cloud infrastructure.