TechEcho
Taming High Cardinality by sharding a stream

27 points by trojanalert almost 2 years ago

1 comment

thamer almost 2 years ago
If you're considering sharding a database, please spend some time finding the best key distribution strategy, and don't just use `key % shard_count` as if it were automatically the right way to do it. The distribution of values on the left side of this mod operator will not necessarily lead to an equal distribution over the shards.

Some will add a hash function around the key, but this only addresses part of the problem: for example, if you started with N shards and ever need to add 1, you will need to move all but roughly a 1/N fraction of the keys to new shards. And it's not just about permanently growing the number of shards; other maintenance operations, such as replacing a host, can require you to redistribute the data, depending on how replication is set up.

Consistent hashing can often drastically reduce the number of keys to shuffle, but in any case it's worth spending some time on this early and getting it right, rather than paying later for having overlooked its impact.
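To make the resharding cost concrete, here is a minimal sketch that compares hashed modulo placement with a simple consistent-hash ring when growing from 8 to 9 shards. Everything in it is an assumption for illustration and not taken from the linked article: the `stable_hash` helper, the `HashRing` class, the shard names, the 100 virtual nodes per shard, and the synthetic `user-N` keys.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Deterministic 64-bit hash (the built-in hash() of str is salted per process)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


def mod_shard(key: str, shard_count: int) -> int:
    """Hashed modulo placement: hash(key) % shard_count."""
    return stable_hash(key) % shard_count


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, shards, vnodes=100):
        # Each shard is placed at `vnodes` pseudo-random points on the ring.
        points = sorted(
            (stable_hash(f"{shard}#{i}"), shard)
            for shard in shards
            for i in range(vnodes)
        )
        self._hashes = [h for h, _ in points]
        self._owners = [s for _, s in points]

    def shard_for(self, key: str) -> str:
        # A key belongs to the first ring point clockwise from its hash,
        # wrapping around to the start of the ring at the end.
        idx = bisect.bisect_right(self._hashes, stable_hash(key)) % len(self._hashes)
        return self._owners[idx]


def moved_fraction(before, after, keys) -> float:
    """Fraction of keys whose shard differs between two placement functions."""
    return sum(1 for k in keys if before(k) != after(k)) / len(keys)


keys = [f"user-{i}" for i in range(10_000)]
old_shards = [f"shard-{i}" for i in range(8)]
new_shards = old_shards + ["shard-8"]

mod_moved = moved_fraction(
    lambda k: mod_shard(k, len(old_shards)),
    lambda k: mod_shard(k, len(new_shards)),
    keys,
)

old_ring, new_ring = HashRing(old_shards), HashRing(new_shards)
ring_moved = moved_fraction(old_ring.shard_for, new_ring.shard_for, keys)

print(f"modulo:          {mod_moved:.1%} of keys moved")
print(f"consistent hash: {ring_moved:.1%} of keys moved")
```

With modulo placement, a key stays put only when its hash gives the same remainder for 8 and 9, so close to 8/9 of the keys should move. With the ring, only keys that now fall to the new shard move, which should be on the order of 1/9 of them; the exact share depends on where the virtual nodes happen to land.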