From everything I've read, Kafka is a really bad fit for AWS. It is not tolerant of network partitions; they stated this in their own design document, where they present it as a CA system. In his Jepsen post on Kafka, Kyle Kingsbury backed this up with more data.<p>Given this, why do people deploy it to AWS? It seems like an invitation to disaster.
Curious whether Cap'n Proto or another zero-copy serialization format might've been a better choice than protobufs? Protobufs still need to parse the message; it's just that the code to do so is automatically generated for you. With Cap'n Proto you can read messages directly off the wire and save them, or mmap a file full of them and access them in place.<p>Most of the downsides of Cap'n Proto also don't apply here. Compressing with Snappy will elide all the zero-valued padding bytes. The format of an HTTP message is relatively stable, so you don't get a lot of churn in the message layout. HTTP doesn't have a lot of optional fields, so that's another potential source of Cap'n Proto bloat that doesn't apply to your use case.
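To make the contrast concrete, here's a rough sketch of the zero-copy read path, assuming the capnproto-java bindings and a made-up HttpMessage schema (neither is from the article). The point is that getRoot hands back readers over the underlying bytes; fields are pulled lazily from the buffer rather than deserialized into intermediate objects the way a protobuf parse would be:

    import org.capnproto.MessageReader;
    import org.capnproto.Serialize;
    import java.nio.channels.FileChannel;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;

    public class CapnpReadSketch {
        public static void main(String[] args) throws Exception {
            // Hypothetical file of serialized messages; HttpMessage is an assumed
            // schema-generated class, not anything from the original post.
            try (FileChannel channel = FileChannel.open(
                    Paths.get("messages.capnp.bin"), StandardOpenOption.READ)) {
                MessageReader message = Serialize.read(channel);
                HttpMessage.Reader req = message.getRoot(HttpMessage.factory);
                // Reading a field just follows a pointer into the buffer;
                // nothing is copied until you actually touch it.
                System.out.println(req.getPath());
            }
        }
    }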
My lazy self always wonders how nice it would be if some of these infrastructure designs were always accompanied by a docker/fig configuration example, to be used as a starting point/proof of concept for people looking for similar solutions.<p>It obviously happens sometimes [1] [2], but it should be more common...<p>[1] <a href="http://alvinhenrick.com/2014/08/18/apache-storm-and-kafka-cluster-with-docker/" rel="nofollow">http://alvinhenrick.com/2014/08/18/apache-storm-and-kafka-cl...</a><p>[2] <a href="https://registry.hub.docker.com/u/ches/kafka/" rel="nofollow">https://registry.hub.docker.com/u/ches/kafka/</a>
We use Netty for transport in a similar scenario. We haven't stress-tested it at the limits mentioned, but wouldn't a write-behind cache handle a large volume of writes? Of course there will be a delay, but it is not hard to implement.
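Roughly the shape I mean, with a made-up WriteBehindBuffer and a generic sink standing in for whatever the slow downstream write is (Kafka producer, database, etc.); the batch size and queue bound are arbitrary:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.function.Consumer;

    // Minimal write-behind buffer: callers enqueue records cheaply and a
    // background thread drains them to the slow sink in batches. The delay
    // mentioned above is the time records sit in the queue before a flush.
    public class WriteBehindBuffer<T> {
        private final BlockingQueue<T> queue = new LinkedBlockingQueue<>(100_000);
        private final Consumer<List<T>> sink;   // the slow downstream write
        private final int batchSize;

        public WriteBehindBuffer(Consumer<List<T>> sink, int batchSize) {
            this.sink = sink;
            this.batchSize = batchSize;
            Thread flusher = new Thread(this::flushLoop, "write-behind-flusher");
            flusher.setDaemon(true);
            flusher.start();
        }

        /** Fast path: blocks only when the buffer is full (back-pressure). */
        public void write(T record) throws InterruptedException {
            queue.put(record);
        }

        private void flushLoop() {
            List<T> batch = new ArrayList<>(batchSize);
            try {
                while (true) {
                    batch.add(queue.take());              // wait for at least one record
                    queue.drainTo(batch, batchSize - 1);  // grab whatever else is ready
                    sink.accept(batch);                   // one bulk write downstream
                    batch.clear();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

Usage would be something like new WriteBehindBuffer<Event>(batch -> store.bulkInsert(batch), 500), where store.bulkInsert is whatever bulk write your backend exposes.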
One thing that isn't clear about Kafka or Kinesis: when you have multiple consumers for the same topic, how do they each get the data, and in what order? And what happens when a consumer dies? How do you handle consumers in your data pipeline?
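To make the question concrete, the pattern I'm asking about is Kafka's consumer groups: consumers sharing a group.id split a topic's partitions among themselves, ordering is only guaranteed within a partition, and if a consumer dies its partitions get rebalanced onto the survivors. A minimal sketch using the newer Java KafkaConsumer client (just an illustration; the article doesn't say which client they use, and the broker/topic names here are placeholders):

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class PipelineConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");  // placeholder broker
            props.put("group.id", "my-pipeline");            // consumers sharing this id split the partitions
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("http-logs"));  // placeholder topic
                while (true) {
                    // Records arrive in order within each partition this instance owns;
                    // if the process dies, those partitions are reassigned to other group members.
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records) {
                        System.out.printf("partition=%d offset=%d value=%s%n",
                                record.partition(), record.offset(), record.value());
                    }
                }
            }
        }
    }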