
KIP-500: Replace ZooKeeper with a Self-Managed Metadata Quorum

139 points · by telekid · almost 6 years ago

13 comments

mayank · almost 6 years ago
Note: this is not about replacing ZooKeeper in general with Kafka, as the title might suggest; it's about Kafka considering an alternative to its internal use of ZooKeeper:

"Currently, Kafka uses ZooKeeper to store its metadata about partitions and brokers, and to elect a broker to be the Kafka Controller. We would like to remove this dependency on ZooKeeper. This will enable us to manage metadata in a more scalable and robust way, enabling support for more partitions. It will also simplify the deployment and configuration of Kafka."
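The controller election the KIP quote refers to follows the classic ZooKeeper pattern: every broker races to create an ephemeral `/controller` znode, the broker whose create succeeds becomes the controller, and the others watch for that node's deletion. The sketch below is a minimal in-memory simulation of that pattern, not the real ZooKeeper client API (a real client such as kazoo would raise `NodeExistsError` and deliver watch events asynchronously):

```python
# Minimal in-memory simulation of ephemeral-znode leader election,
# the pattern Kafka's controller election uses on top of ZooKeeper.
# This MockZk class is an illustrative stand-in, not a real ZooKeeper client.

class MockZk:
    """Tiny stand-in for a ZooKeeper namespace (create-if-absent semantics)."""
    def __init__(self):
        self.nodes = {}

    def create(self, path, data):
        if path in self.nodes:
            raise FileExistsError(path)  # a real client raises NodeExistsError
        self.nodes[path] = data

    def delete(self, path):
        # In real ZooKeeper the ephemeral node vanishes when the owner's
        # session expires, and watchers are notified so they can re-elect.
        self.nodes.pop(path, None)

def try_become_controller(zk, broker_id):
    """Race to create /controller; the winner is the controller."""
    try:
        zk.create("/controller", broker_id)
        return True   # this broker won the election
    except FileExistsError:
        return False  # another broker is controller; watch the node and wait

zk = MockZk()
print(try_become_controller(zk, "broker-1"))  # True: first create wins
print(try_become_controller(zk, "broker-2"))  # False: node already exists
zk.delete("/controller")                      # controller session expires
print(try_become_controller(zk, "broker-2"))  # True: broker-2 takes over
```

KIP-500 proposes moving this role (and the metadata the controller manages) into a Raft-style quorum inside Kafka itself, removing the external dependency.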
kevsim · almost 6 years ago
In the days before hosted Kafka implementations were readily available, I tried to set up Kafka in our AWS infrastructure. Getting ZooKeeper working in a world based on autoscaling groups was a nightmare. It felt like it was built for the days when each server was a special snowflake.

Looking forward to seeing if this gains traction.
mykowebhn · almost 6 years ago
Isn't managing consensus extremely hard to do? Wouldn't one want to rely on a proven solution rather than spinning up a new one?
noahdesu · almost 6 years ago
This sounds really similar to some of the work coming out of Vectorized on Redpanda [0]. They're building a Kafka-API-compatible system in C++ that's apparently achieving significant performance gains (throughput, tail latency) while maintaining operational simplicity.

[0]: https://vectorized.io/redpanda
linuxhansl · almost 6 years ago
I don't quite understand why everybody and their mother is trying to remove ZooKeeper from their setup.

In my past I've seen this many times, and each time people went back to ZooKeeper after a while, because, as it turns out, consensus is hard, and ZooKeeper is battle-hardened.
simtel20 · almost 6 years ago
Is no one else running into FastLeaderElectionFailed? When you have a system that writes a lot of offset/transaction info to ZooKeeper, you can push the zxid's 32-bit counter to rollover in a matter of days. When this happens, it can bring ZooKeeper to a grinding halt for 15 minutes while two nodes try to nominate themselves for leadership and the rest of the cluster sits back and waits for a timeout.

https://issues.apache.org/jira/browse/ZOOKEEPER-2164

https://issues.apache.org/jira/browse/ZOOKEEPER-2791

Past requests (I can't find them in JIRA at the moment, so I need to paraphrase) for a call that initiates a controlled leadership move to another node have been turned down as "you don't need this", yet leader election fails in some circumstances! In addition, there's no command or configuration to disable FastLeaderElection.

So the ZooKeeper maintainers keep operators limited to flipping nodes off and on again, which is a really bad way to manage software because it impacts clients as well as leadership (and even if clients recover, most code I've seen likes to make some noise when ZooKeeper connections flap).

I would really like to eliminate every use case for ZooKeeper where there is a chance that the zxid will exceed the size of its 32-bit counter component in the span of, say, a decade, so that as an operator I don't have to set alerts on the zxid counter creeping up, or reset ZooKeeper and restart all of its clients (many versions of many ZooKeeper clients don't retry after connection loss, don't retry after a timeout, don't cope with the primary connection failing, will have totally given up after 15 minutes, etc.).

I think the Kafka maintainers have been doing a better job of actively maintaining their code and ensuring it works in adverse conditions, so I'm on board with this proposal.

ZooKeeper isn't magic; it's just pretty good at most of what it does, and I think projects that understand when they've pushed ZooKeeper into a bad corner may benefit from this kind of move, if they also have a good idea of how they can do better.
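The rollover risk above is easy to sanity-check with back-of-the-envelope arithmetic: a zxid is a 64-bit id whose low 32 bits are a per-epoch transaction counter, so it overflows after 2^32 writes in one leader epoch. The write rates below are illustrative assumptions, not measurements from any particular cluster:

```python
# Estimate how long until the 32-bit zxid counter overflows at a steady
# write rate (the ZOOKEEPER-2164 scenario described above).
# The sample rates are hypothetical, chosen only to illustrate the scale.

MAX_TXNS_PER_EPOCH = 2 ** 32  # low 32 bits of the zxid: 4,294,967,296 txns

def days_until_rollover(writes_per_second: float) -> float:
    """Days until the zxid counter overflows at a steady write rate."""
    seconds = MAX_TXNS_PER_EPOCH / writes_per_second
    return seconds / 86_400  # seconds per day

for rate in (1_000, 10_000, 50_000):
    print(f"{rate:>6} writes/s -> rollover in ~{days_until_rollover(rate):.1f} days")
```

At 10,000 writes/s the counter overflows in roughly five days, consistent with the "matter of days" claim for offset/transaction-heavy workloads.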
james-mcelwain · almost 6 years ago
There's a toy Kafka implementation written in Go that attempts to do this: https://github.com/travisjeffery/jocko

Previous HN discussion: https://news.ycombinator.com/item?id=13449728
MrBuddyCasino · almost 6 years ago
To all the people wondering why "replace battle-tested ZK" and how "consensus is hard": it's right there under the motivation header:

> enabling support for more partitions

I don't know if any of you have ever run a high-throughput Kafka cluster with a large number of partitions (as in, thousands of them), but it's not pretty. Rebalancing can easily take half an hour after a rollout, and throughput is degraded during that time. We recently had to move to shared topics because it became untenable.

This is a very welcome change!
camiloaguilar · almost 6 years ago
I'm not too convinced by the approach. I've been anxiously waiting for https://vectorized.io/ to release their message queue. It is built in modern C++ and uses ScyllaDB's Seastar framework to do I/O scheduling in userspace, with better mechanical sympathy. And like HashiCorp's Nomad and Vault, which I'm a fan of, it has built-in distributed consensus and is easy to operate.
rhacker · almost 6 years ago
It would be nice if all the cloud vendors agreed on a key/value and/or consensus protocol that all servers in a cluster could connect to, maybe even supported via Docker, natively, even if there's just one cluster member. Like plug-and-play for clustering tech (basically Bonjour, but suitable for cloud/enterprise software).
ccleve · almost 6 years ago
I have written a Raft implementation in Java. If anyone from the Kafka project wants it, please contact me. It's not open source, but I own it and could make it so.
PunksATawnyFill · almost 6 years ago
Whoever wrote this doesn't know WTF a quorum is.
superapc · almost 6 years ago
We (alluxio.io) went through a similar process, replacing ZooKeeper with CopyCat (a Raft implementation) for both leader election and storing the shared journal, as of Alluxio 2.0. It works pretty well.