
KIP-500: Replace ZooKeeper with a Self-Managed Metadata Quorum

139 points · by telekid · almost 6 years ago

13 comments

mayank · almost 6 years ago
Note: this is not about replacing ZooKeeper in general with Kafka, as the title might suggest; it's about Kafka considering an alternative to its internal use of ZooKeeper:

"Currently, Kafka uses ZooKeeper to store its metadata about partitions and brokers, and to elect a broker to be the Kafka Controller. We would like to remove this dependency on ZooKeeper. This will enable us to manage metadata in a more scalable and robust way, enabling support for more partitions. It will also simplify the deployment and configuration of Kafka."
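The controller election the KIP quote refers to follows the classic ZooKeeper pattern: every broker races to create an ephemeral `/controller` znode, the broker whose create succeeds becomes the controller, and the others watch for that node's deletion. The sketch below is a minimal in-memory simulation of that pattern, not the real ZooKeeper client API (a real client such as kazoo would raise `NodeExistsError` and deliver watch events asynchronously):

```python
# Minimal in-memory simulation of ephemeral-znode leader election,
# the pattern Kafka's controller election uses on top of ZooKeeper.
# This MockZk class is an illustrative stand-in, not a real ZooKeeper client.

class MockZk:
    """Tiny stand-in for a ZooKeeper namespace (create-if-absent semantics)."""
    def __init__(self):
        self.nodes = {}

    def create(self, path, data):
        if path in self.nodes:
            raise FileExistsError(path)  # a real client raises NodeExistsError
        self.nodes[path] = data

    def delete(self, path):
        # In real ZooKeeper the ephemeral node vanishes when the owner's
        # session expires, and watchers are notified so they can re-elect.
        self.nodes.pop(path, None)

def try_become_controller(zk, broker_id):
    """Race to create /controller; the winner is the controller."""
    try:
        zk.create("/controller", broker_id)
        return True   # this broker won the election
    except FileExistsError:
        return False  # another broker is controller; watch the node and wait

zk = MockZk()
print(try_become_controller(zk, "broker-1"))  # True: first create wins
print(try_become_controller(zk, "broker-2"))  # False: node already exists
zk.delete("/controller")                      # controller session expires
print(try_become_controller(zk, "broker-2"))  # True: broker-2 takes over
```

KIP-500 proposes moving this role (and the metadata the controller manages) into a Raft-style quorum inside Kafka itself, removing the external dependency.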
kevsim · almost 6 years ago
In the days before hosted Kafka implementations were readily available, I tried to set up Kafka in our AWS infrastructure. Getting ZooKeeper working in a world based on autoscaling groups was a nightmare. It felt like it was built for the days when each server was a special snowflake.

Looking forward to seeing if this gains traction.
mykowebhn · almost 6 years ago
Isn't managing consensus extremely hard to do? Wouldn't one want to rely on a proven solution rather than spinning up a new one?
noahdesu · almost 6 years ago
This sounds really similar to some of the work coming out of Vectorized on Redpanda [0]. They're building a Kafka-API-compatible system in C++ that's apparently achieving significant performance gains (throughput, tail latency) while maintaining operational simplicity.

[0]: https://vectorized.io/redpanda
linuxhansl · almost 6 years ago
I don't quite understand why everybody and their mother is trying to remove ZooKeeper from their setup.

In my past I've seen this many times, and each time people went back to ZooKeeper after a while, because, as it turns out, consensus is hard, and ZooKeeper is battle-hardened.
simtel20 · almost 6 years ago
Is no one else running into FastLeaderElectionFailed? When you have a system that writes a lot of offset/transaction info to ZooKeeper, you can push the zxid's 32-bit counter to rollover in a matter of days. When this happens, it can bring ZooKeeper to a grinding halt for 15 minutes while two nodes try to nominate themselves for leadership and the rest of the cluster sits back and waits for a timeout.

https://issues.apache.org/jira/browse/ZOOKEEPER-2164

https://issues.apache.org/jira/browse/ZOOKEEPER-2791

Past requests (I can't find them in JIRA at the moment, so I need to paraphrase) for a call that initiates a controlled leadership move to another node have been turned down as "you don't need this", yet leader election fails in some circumstances! In addition, there's no command or configuration to disable FastLeaderElection.

So the ZooKeeper maintainers keep operators limited to flipping nodes off and on again, which is a really bad way to manage software because it impacts clients as well as leadership (and even if clients recover, most code I've seen likes to make some noise when ZooKeeper connections flap).

I would really like to eliminate every use case for ZooKeeper where there is a chance that the zxid will exceed the size of its 32-bit counter component in the span of, say, a decade, so that as an operator I don't have to set alerts on the zxid counter creeping up, or reset ZooKeeper and restart all of its clients (many versions of many ZooKeeper clients don't retry after connection loss, don't retry after a timeout, don't cope with the primary connection failing, will have totally given up after 15 minutes, etc.).

I think the Kafka maintainers have been doing a better job of actively maintaining their code and ensuring it works in adverse conditions, so I'm on board with this proposal.

ZooKeeper isn't magic; it's just pretty good at most of what it does, and I think projects that understand when they've pushed ZooKeeper into a bad corner may benefit from this kind of move, if they also have a good idea of how they can do better.
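The rollover risk above is easy to sanity-check with back-of-the-envelope arithmetic: a zxid is a 64-bit id whose low 32 bits are a per-epoch transaction counter, so it overflows after 2^32 writes in one leader epoch. The write rates below are illustrative assumptions, not measurements from any particular cluster:

```python
# Estimate how long until the 32-bit zxid counter overflows at a steady
# write rate (the ZOOKEEPER-2164 scenario described above).
# The sample rates are hypothetical, chosen only to illustrate the scale.

MAX_TXNS_PER_EPOCH = 2 ** 32  # low 32 bits of the zxid: 4,294,967,296 txns

def days_until_rollover(writes_per_second: float) -> float:
    """Days until the zxid counter overflows at a steady write rate."""
    seconds = MAX_TXNS_PER_EPOCH / writes_per_second
    return seconds / 86_400  # seconds per day

for rate in (1_000, 10_000, 50_000):
    print(f"{rate:>6} writes/s -> rollover in ~{days_until_rollover(rate):.1f} days")
```

At 10,000 writes/s the counter overflows in roughly five days, consistent with the "matter of days" claim for offset/transaction-heavy workloads.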
james-mcelwain · almost 6 years ago
There's a toy Kafka implementation written in Go that attempts to do this: https://github.com/travisjeffery/jocko

Previous HN discussion: https://news.ycombinator.com/item?id=13449728
MrBuddyCasino · almost 6 years ago
To all the people wondering why "replace battle-tested ZK" and how "consensus is hard": it's right there under the motivation header:

> enabling support for more partitions

I don't know if any of you have ever run a high-throughput Kafka cluster with a large number of partitions (as in, thousands of them), but it's not pretty. Rebalancing can easily take half an hour after a rollout, and throughput is degraded during that time. We recently had to move to shared topics because it became untenable.

This is a very welcome change!
camiloaguilar · almost 6 years ago
I'm not too convinced by the approach. I've been anxiously waiting for https://vectorized.io/ to release their message queue. It is built in modern C++ and uses ScyllaDB's Seastar framework to do I/O scheduling in userspace, with better mechanical sympathy. And like HashiCorp's Nomad and Vault, which I'm a fan of, it has built-in distributed consensus and is easy to operate.
rhacker · almost 6 years ago
It would be nice if all the cloud vendors agreed on a key/value and/or consensus protocol that all servers in a cluster could connect to, maybe even supported via Docker, natively, even if there's just one cluster member. Like plug-and-play for clustering tech (basically Bonjour, but suitable for cloud/enterprise software).
ccleve · almost 6 years ago
I have written a Raft implementation in Java. If anyone from the Kafka project wants it, please contact me. It's not open source, but I own it and could make it so.
PunksATawnyFill · almost 6 years ago
Whoever wrote this doesn't know WTF a quorum is.
superapc · almost 6 years ago
We (alluxio.io) went through a similar process, replacing ZooKeeper with CopyCat (a Raft implementation) for both leader election and storing the shared journal, as of Alluxio 2.0. It works pretty well.