TE
科技回声
首页24小时热榜最新最佳问答展示工作
GitHubTwitter
首页

科技回声

基于 Next.js 构建的科技新闻平台,提供全球科技新闻和讨论内容。

GitHubTwitter

首页

首页最新最佳问答展示工作

资源链接

HackerNews API原版 HackerNewsNext.js

© 2025 科技回声. 版权所有。

Taming High Cardinality by sharding a stream

27 点作者 trojanalert将近 2 年前

1 comment

thamer将近 2 年前
If you&#x27;re considering sharding a database, please spend some time finding the best key distribution strategy, and don&#x27;t just use `key % shard_count` as if it was automatically the right way to do it. The distribution of values for the left side of this mod operator will not necessarily lead to equal distribution over the shards.<p>Some will add a hash function around the key, but this only addresses part of the problem: for example if you started with N shards and ever need to add 1, you will need to move all but O(1&#x2F;N) keys to new shards. And it&#x27;s not just about growing the number of shards permanently, other maintenance operations such as replacing a host can require you to redistribute the data depending on how replication is set up.<p>Consistent hashing can often help drastically reduce the number of keys to shuffle, but in any case it&#x27;s something worth spending some time on early on and getting right rather than having to pay later for having overlooked its impact.
评论 #37217878 未加载