Why I am not a fan of Apache Kafka

70 points by kornish over 8 years ago

16 comments

This article is pretty out of date; I think the central concerns have actually been addressed.

It's true that when we were working at LinkedIn Kafka tended to have much better Java support. Since founding Confluent (I'm one of the co-founders) we've really focused on improving the situation outside Java.

A few specific corrections:

1. We added full support for consumers with no interaction with ZooKeeper in the main Kafka protocol. There is no longer any direct interaction with ZooKeeper from either the producer or consumer. We did this because we care a lot about the non-Java clients.

2. Kafka has been extremely disciplined about backwards compatibility. The protocol comes with versioning and changes are always implemented in a way that supports both the old and new version and can be rolled out without downtime. In the five year history of the project we did one backwards incompatible release--the break from 0.5.x-0.7.x to 0.8.x. This was done intentionally to allow us to refactor the APIs. I think this is a pretty good track record.

It's worth also addressing why Kafka clients directly access nodes in the cluster rather than requiring a proxy layer. The reason we do this is to allow very high throughput, partition-aware processing. This is really required for use cases like stream processing that need to process data efficiently, especially in cases where you are reprocessing data. You can always build a proxy layer on top of direct access but not vice versa.

Confluent (where I work) is doing two things that help the non-Java client ecosystem:

1. We maintain an open source REST proxy that provides decoupled access (albeit with a little overhead compared to the direct clients).

2. We have picked up work on clients. We offer and fully support a C/C++ client, a Python client, and have a Go client coming soon. All of these are in feature parity with the Java clients. (More on the way).

Both of these efforts are open source and Apache licensed and included in the open source Confluent Platform distribution of Kafka.
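For readers wondering what "partition-aware" direct access means in practice, here is a minimal sketch, not Kafka's actual client code. The broker addresses, the `PARTITION_LEADERS` table, and the helper names are all hypothetical, and CRC32 is only a stand-in for whatever hash a real client uses. The idea is that the client hashes the key to a partition and then talks straight to the broker leading that partition, with no proxy hop in between.

```python
import zlib

# Hypothetical cluster metadata: partition number -> broker leading it.
# A real client fetches this from the cluster; it is hard-coded here.
PARTITION_LEADERS = {
    0: "broker-a:9092",
    1: "broker-b:9092",
    2: "broker-c:9092",
}


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition with a stable hash.

    CRC32 is an assumption made for this sketch; real clients use
    their own hash function.
    """
    return zlib.crc32(key) % num_partitions


def route(key: bytes) -> str:
    """Return the broker a partition-aware client would contact for key."""
    partition = partition_for(key, len(PARTITION_LEADERS))
    return PARTITION_LEADERS[partition]


if __name__ == "__main__":
    # The same key always lands on the same partition, and so the same broker.
    for key in (b"user-1", b"user-2", b"user-1"):
        print(key, "->", route(key))
```

Because the same key always maps to the same partition, a consumer of that partition sees all of a key's messages in order, which is roughly why a proxy can be layered on top of this scheme but not the other way around.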


ah- over 8 years ago

This needs a [2015]. Back then there wasn't a single usable client for .NET.

Since then the non-Java clients have massively improved. In particular https://github.com/edenhill/librdkafka is fantastic. There's also https://github.com/dpkp/kafka-python/, with support for consumer groups and all the other modern features.

The basic criticism of requiring a complex client is valid; however, you cannot achieve the delivery guarantees that Kafka gives you without one. The alternative would be to have a local agent process like Consul, but that wouldn't give you the throughput that Kafka gets.

Disclaimer: I've built a C# client for Kafka based on librdkafka (https://github.com/ah-/rdkafka-dotnet), so I'm biased.


agentgt over 8 years ago

I think in large part why people dislike Kafka is that they don't really need Kafka (and the complexity that comes with it).

Don't get me wrong, Kafka is good tech if you really need that level of throughput, but I honestly think most companies don't have that much data and/or are just putting too much in the pipe. But they go with Kafka anyway, I guess to "CYA" for future scaling, only to find out Kafka is complicated.

I mentioned this earlier (a couple of days ago, https://news.ycombinator.com/item?id=12520159) for someone using Kafka for a logging aggregation system only to drop it for ZMQ.

My other point is if your endpoint isn't fast enough it doesn't really matter what your pipe is. Pick an easy to use pipe first (like RMQ) and worry about scaling the endpoints.


Xorlev over 8 years ago

Author doesn't understand Kafka, doesn't have a good client for his language, therefore doesn't like Kafka.

Jay Kreps responds to a few of his points -- the complexity of the client is for scalability reasons.

> When you Produce a Message Set onto the bus, you don't directly get back a response telling you that the messages have successfully been persisted to one or more partitions.

At least in the Java client, this isn't true. True, if you used the async API before 0.9 you weren't able to get an ACK, but the sync producer would block until a message was published. In the new producer, you're handed futures + the ability to provide a callback[1].

[1] http://kafka.apache.org/082/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html
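To make the "futures + callback" point concrete, here is a rough sketch of that style of acknowledgement. It is an in-memory toy, not a Kafka client: `ToyProducer`, its `send` signature, and the rule that empty messages are rejected are all invented for illustration. The shape is only loosely modelled on a producer whose send call returns a future and optionally invokes a callback.

```python
from concurrent.futures import Future
from typing import Callable, List, Optional

# Callback receives (offset, error); exactly one of the two is None.
AckCallback = Callable[[Optional[int], Optional[Exception]], None]


class ToyProducer:
    """In-memory stand-in for a producer (hypothetical, not a real client)."""

    def __init__(self) -> None:
        self.log: List[bytes] = []

    def send(self, value: bytes, callback: Optional[AckCallback] = None) -> Future:
        """Append value to the log and acknowledge it via a future."""
        future: Future = Future()
        if not value:
            # Made-up failure case so the error path can be exercised.
            error = ValueError("empty message rejected")
            future.set_exception(error)
            if callback is not None:
                callback(None, error)
            return future
        self.log.append(value)
        offset = len(self.log) - 1
        future.set_result(offset)  # the "ack": where the message landed
        if callback is not None:
            callback(offset, None)
        return future


if __name__ == "__main__":
    producer = ToyProducer()
    # Blocking style: wait on the future for the ack.
    print("acked at offset", producer.send(b"hello").result())
    # Callback style: get notified without blocking the caller.
    producer.send(b"world", lambda off, err: print("callback:", off, err))
```

Either way the sender learns directly whether the write succeeded; there is no need to consume the topic to find out.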


slap_shot over 8 years ago

I work with Kafka every day and don't really think the OP's concerns (it's too complex and there isn't third party driver support to the degree of Redis) are too serious. They both will be solved with time.

My bigger fear is Confluent - the private company founded by many Kafka core committers and employing many Kafka committers.

Confluent offers open source extensions to Kafka's core in the form of connectors (boilerplate code to connect to common sources and sinks like JDBC databases, files, and Hadoop).

Confluent also offers (as of right now) one closed source product extension (Control Center - a cluster monitoring system similar to the management UI of RabbitMQ, etc) that requires an enterprise subscription (several thousand dollars per node per year) after a 30 day trial.

$30.9MM for a service/support based company seems like a lot of money and drives a very high valuation that needs to show return. I personally am skeptical of the service/support venture backed model[0].

My fear is that Kafka will increasingly require "enterprise" support tools with less and less support and features available to people who do not pay for enterprise support. The amount of documentation of the 0.10 release (particularly the Streams API) that resided on the Confluent page versus the Kafka page is a HUGE red flag to me.

I have all the respect in the world for Jay, and the Kafka/Confluent team, but I find myself avoiding Confluent's tools (Kafka Connect and Schema Registry) because of fear that those will eventually be closed source or require an enterprise subscription.

[0] I'm not an investor but I haven't seen many of these models work out in the long run. A recent podcast by A16Z touches on this subject very well, with an A16Z partner saying he believes exactly one company has pulled this model off well at venture scale - Red Hat. http://a16z.com/2016/08/19/pricing-freemium-premium-opensource/


admnor over 8 years ago

Hi. Author of the article/gist here. It was actually written in Spring of last year, I think. I've just updated it because, as people have said, many of the issues have been addressed one way or the other:

* The KafkaREST proxy makes life a little easier
* librdkafka makes life a lot easier, especially when you can just download a thin wrapper around it for your chosen language
* I am no longer working at the place that chose Kafka 0.8 following no testing at all for their use case, and refused to back down through months of hell both writing code and trying to keep clusters available
* The people at Confluent have done a lot of good work since then, both on Kafka itself and on various auxiliary tools/products.

So yeah, I would probably look at Kafka again today if I needed that kind of functionality. Screw ZooKeeper though.

reitanqild over 8 years ago

To be honest I think very many of us don't need Kafka. As with many other things, as long as we aren't handling more than a few thousand messages/sec, any decent ordinary message broker like ActiveMQ should do.

Caveat: do not install a production message bus in a VM.

I recently listened to this talk which compares and explains very nicely: https://vimeo.com/181925293


yummyfajitas over 8 years ago

I find the complaints that LinkedIn hasn't open sourced their REST client to be a little silly. I'm in a similar situation - PHP needs to send commands into Kafka but PHP Kafka libs aren't great (or maybe they are and my PHP guys don't want to use them).

So I wrote a little Thrift endpoint (in Scala) which receives messages and writes to Kafka. It's under 100 lines of code. Probably another couple of hundred lines for the PHP version of the Thrift client.

Are we really complaining that LinkedIn hasn't open sourced their 200 LOC REST client?


KirinDave over 8 years ago

"Really new" around November 2015. It was released in 2011 though. It's only 2 years younger than Redis, and is primarily an exercise in using Zookeeper.

It's very surprising to hear people suggest Redis pubsub is a valid substitute for Kafka, when in fact it's not. It has a fundamentally different set of operating characteristics, a different sweet spot. Kafka isn't great from a consumer resumption point of view, but at least there ARE options.

It's also untrue that Kafka gives no feedback on a successful message put. This is obviously a bug or design shortcoming in the post author's chosen toolchain which is correctable, and was part of the core java toolchain as of late 2015 AFAIK.

I do agree that the Kafka system has an architecture that maximizes difficulty for new language bindings. Certainly C# has the tools to write an excellent implementation, assuming someone understands zookeeper well enough.


notacoward over 8 years ago

I liked "behind a load balancer like a normal server" the best. How does the author think a load balancer works? By making an even less accurate guess about the state of servers behind it than Kafka can do via Zookeeper. AFAICT the author is just upset that Kafka isn't exactly like Redis, whereas most sane people would be quite glad it's not.

hifier over 8 years ago

Seems like mostly FUD. No mention of specific issues in any clients. And please have a look at other high throughput systems (like Cassandra or VoltDB) before claiming that a load balancer is the proper way to connect clients to a distributed system.


falcolas over 8 years ago

Quick note - the author just did an update for an "As Of Sept 2016".

TL;DR: Still not a great solution for their original problem. The development of good C/C++ libraries means he could now get around the lack of decent C# libraries. Overall architecture still pretty f'ed.

fusiongyro over 8 years ago

I don't have as much depth with it as the author, but I also felt like using it was kind of a bait and switch, especially coming from having read (and loved) Jay's book I Heart Logs. We're using it at my work but I'm not really in love with it and will probably be trading it out for AMQP for an event system we're planning.

I was working with a junior engineer on this project, and he kept on getting confused about what features were ZooKeeper and which were Kafka. There are two complex technologies here as a first hurdle to using it. This isn't ideal.

In the book he describes a scenario where your stream processors record to their own storage where they are in the stream, but Kafka's stock consumers now seem to keep track of that in ZooKeeper instead, which seems like an odd place to make the decision.

We initially had a small Node.js server, just to experiment with, but I discovered that it had no error handling at all. I could put fake hostnames in and it would just hang out, as if eventually maybe they would appear and it could connect. This is really the Kafka driver's fault; we switched back to Java and the Java client worked, it's just a little overcomplex. But we also periodically came in to find the server had crashed. I still don't know why. (I'm open to it being our general ignorance and a misconfiguration or something.)

In the book, Jay describes this beautiful computing model where you have these log streams and you just process them, and it's high-level and very alluring. The actual APIs that Kafka gives you are not beautiful or intuitive. Rewinding to the beginning is something you can only do after you read, for instance. We were thinking of using it like an external write-ahead-log (as described in the book) but it just doesn't really support that use-case directly through its API.

It's kind of a shame, because AMQP doesn't support that use case all that well. I believe you have to decide whether you want your queue to act like a round-robin affair or as a persistent queue. Kafka sort of lets you have both; streams (ostensibly) work like persistent broadcast queues. I don't think I'll be able to use AMQP as a write-ahead-log by itself; probably I'll have to have some kind of mediating service that's just recording events to persistence and have a separate way of getting historical stuff.

I spent a year or so unable to work on Kafka but telling everyone to read I Heart Logs, so getting in there a few months ago and seeing how wide the gap is between the beautiful theory and the practice has been disillusioning. Frankly, the actual system and the one in the book are pretty radically divergent. I am still a big fan of the system described in the book. I hope someday I get to use it.
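The model mentioned above, where a stream processor records its own position in the stream and can rewind, can be sketched in a few lines. This is a toy under stated assumptions, not Kafka's API and not code from the book: `Log`, `Consumer`, and the dict used as an offset store are hypothetical names chosen for illustration.

```python
from typing import Dict, List


class Log:
    """Append-only, in-memory log; a toy stand-in for a single stream."""

    def __init__(self) -> None:
        self._records: List[str] = []

    def append(self, record: str) -> int:
        """Store a record and return the offset it was written at."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset: int, max_records: int = 10) -> List[str]:
        """Return up to max_records records starting at offset."""
        return self._records[offset:offset + max_records]


class Consumer:
    """Reads from a Log and keeps its position in its own store."""

    def __init__(self, name: str, log: Log, offset_store: Dict[str, int]) -> None:
        self.name = name
        self.log = log
        # Hypothetical store; in practice this could be the processor's own DB.
        self.offset_store = offset_store
        # Resume from the last committed position, or from the beginning.
        self.position = offset_store.get(name, 0)

    def poll(self) -> List[str]:
        """Fetch the next batch and advance the in-memory position."""
        records = self.log.read(self.position)
        self.position += len(records)
        return records

    def commit(self) -> None:
        """Persist the current position so a restart resumes from here."""
        self.offset_store[self.name] = self.position

    def seek(self, offset: int) -> None:
        """Rewind (or skip ahead) without having to read first."""
        self.position = offset


if __name__ == "__main__":
    log, store = Log(), {}
    for event in ("created", "updated", "deleted"):
        log.append(event)
    consumer = Consumer("indexer", log, store)
    print(consumer.poll())
    consumer.commit()
    consumer.seek(0)  # replay from the start
    print(consumer.poll())
```

In this sketch the log itself knows nothing about consumers, so replaying history is just a matter of the consumer choosing a different starting offset.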


jknoepfler over 8 years ago

The linked article now includes a giant disclaimer on top more or less retracting the view expressed in the title. Please update the title to accurately reflect the linked content. Also note that the author is mostly griping because of issues which no longer exist. I've posted the author's words below:

"Update, September 2016

OK, you can pretty much ignore what I wrote below this update, because it doesn't really apply anymore.

I wrote this over a year ago, and at the time I had spent a couple of weeks trying to get Kafka 0.8 working with .NET and then Node.js with much frustration and very little success. I was rather angry. It keeps getting linked, though, and just popped up on Hacker News, so here's sort of an update, although I haven't used Kafka at all this year so I don't really have any new information.

In the end, we managed to get things working with a Node.js client, although we continued to have problems, both with our code and with managing a Kafka/Zookeeper cluster generally. What made it worse was that I did not then, and do not now, believe that Kafka was the correct solution for that particular problem at that particular company. What they were trying to achieve could have been done more simply with any number of other messaging systems, with a subscriber reading messages off and writing them to some form of persistent storage (like Elasticsearch). I'm sure there are issues of scale or whatever where Kafka makes sense.

It is true, as many people have pointed out in the comments, that my primary problem was the lack of a good Kafka client for .NET. If I'd been able to install a Kafka Nuget package and it had just worked, this would never have been written. But I couldn't. Today I could probably use a thin wrapper around librdkafka, and if I ever have to work with Kafka from .NET again, that's probably what I'll do. C/C++ libraries are great for stuff like that: C can talk to anything, and everything can talk to C. Yay.

I do understand the performance-related reasons that drove the decision to design a clever-client architecture, but it was, apparently, extremely difficult to create a good client unless you were working with either Java, or with a lower-level language such as C or Go which could work with the complex protocols and implementation requirements.

So, anyway, like I said, you can ignore the stuff below which was written about an old version of the software, while I was in a very bad mood. But I'm going to leave it here, in the hopes that it may serve as a warning to future developers of really complicated infrastructure components. It probably won't, though."

thomaslee over 8 years ago

> When you Produce a Message Set onto the bus, you don't directly get back a response telling you that the messages have successfully been persisted to one or more partitions. Instead, you must also Consume the bus, and you should eventually receive multiple messages acknowledging the persistence of each message in the set.

Maybe this has changed recently, but IIRC this isn't true if your ProducerRequest has the ack bit set to 1 or 2 (i.e. leader or replica acking):

https://github.com/confluentinc/kafka/blob/79aaf19f24bb48f90404a3e3896d115107991f4c/core/src/main/scala/kafka/api/ProducerRequest.scala#L60

The response/ack is sent directly over the socket sending the request.

> If a Node dies then a "leadership election" happens, ZooKeeper is updated with the new metadata, and your application must react to this and handle the changes. There's a six second delay while this happens

Not that I doubt it, but not sure where six seconds comes from here ... perhaps waiting for partition leader elections? It's been long enough that I can't quite remember exactly what happens during a failover.

> and who knows what happens if you try and send messages to a dead node during that time.

Depends how it died, which client API you're using and how the client is configured. Some combination of:

* data loss if acking is disabled (hint: enable acking)
* backpressure and errors in the client until new partition leaders kick in
* client socket writes hanging "forever"

If the latter is surprising: no SO_SNDTIMEO in pure Java blocking socket I/O. Think the new clients may address that, but not entirely sure.

As an aside: can't emphasize enough how important it is to get your configuration right early. By the time you run into problems, it's often too late. Pay heed to any tuning guides you can find. Talk to Confluent if you're still unsure.

> AND HAVE THEY OPEN SOURCED THIS MAGICAL SERVER? NO, THEY BLOODY HAVEN'T.

https://github.com/confluentinc/kafka-rest this thing? FWIW, it's kind of a joke for high throughput anyway. Last time we spoke to Confluent they sort of discouraged its use for exactly that reason.

Still, it's an easy bridge for folks who aren't too fussed about throughput. Not sure why you'd be using Kafka if throughput's not your thing, but y'know.

> If you are using Java/Scala/Clojure/Kotlin/whatever and can use the Official Java Client then I'm sure Kafka is a perfectly reasonable choice for a message bus, although there are plenty of others that seem to me to be far less bloody-minded.

Despite all the gotchas, Kafka's capable of pretty incredible throughput in a fault-tolerant HA configuration. I can empathize with some of the frustrations, but past a certain scale the proposed alternatives just aren't IMHO.

agounaris over 8 years ago

Why should someone be a fan of Kafka? It's not my hometown team, it's a damn hammer.