Sharding Techniques at Mixpanel Engineering

75 points by suhail, about 14 years ago

3 comments

fleitz, about 14 years ago
I'm surprised no one has considered using CIDR notation for sharding. If you SHA-256 the key, you've got 256 bits of address space. You could store the whole table very compactly, and there are plenty of algorithms out there for making the lookup VERY fast.

You could use DNS to transmit the server info, e.g. do a lookup on a4.c7.3a.d4.ec.1f.5d.f4.c6.f6.5f.77.f3.71.f5.3d.fa.e8.ea.15.cluster.foo.com

In your DNS server, just have a record for *.15.cluster.foo.com; this effectively gives you a /8 on a 256-bit routing space. To find all the servers you need to query to cover the entire keyspace, just issue an AXFR request on cluster.foo.com. This system also makes mirroring and partitioning extremely simple. Say you have disparate servers that can handle disparate amounts of data? Just turn your smaller servers into /9s.

Note: if you're considering implementing this, there are some patent applications around it. Look in my profile if you want to code around the patent application.
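For readers who want to see the shape of the idea, here is a minimal sketch of CIDR-style routing over SHA-256 hashed keys. It is not code from the comment or from Mixpanel: `ShardTable`, `key_address`, `dns_name`, and the server names are hypothetical, and the label order in `dns_name` (most significant hash byte nearest the zone) is an assumption inferred from the `*.15.cluster.foo.com` wildcard example.

```python
import hashlib

KEY_BITS = 256  # width of a SHA-256 digest


def key_address(key: str) -> int:
    """Map a key to an integer in the 256-bit address space."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


class ShardTable:
    """Hypothetical routing table: (prefix_len, prefix_value) -> server."""

    def __init__(self):
        self._routes = {}

    def add(self, prefix_value, prefix_len, server):
        # prefix_value holds the top prefix_len bits of the address,
        # so (0x15, 8) plays the role of "15/8" in CIDR notation.
        self._routes[(prefix_len, prefix_value)] = server

    def lookup(self, key):
        addr = key_address(key)
        lengths = sorted({length for length, _ in self._routes}, reverse=True)
        for length in lengths:  # try the longest (most specific) prefix first
            top_bits = addr >> (KEY_BITS - length)
            server = self._routes.get((length, top_bits))
            if server is not None:
                return server
        raise KeyError(f"no shard covers key {key!r}")


def dns_name(key, zone="cluster.foo.com"):
    """Build a lookup name, one hex label per hash byte (assumed layout)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Reverse the bytes so the most significant one sits next to the zone,
    # which lets a wildcard such as *.15.cluster.foo.com cover a whole /8.
    labels = [f"{byte:02x}" for byte in reversed(digest)]
    return ".".join(labels + [zone])


# Two large servers split the keyspace in half (/1 each).
table = ShardTable()
table.add(0b0, 1, "big-server-a")
table.add(0b1, 1, "big-server-b")
print(table.lookup("user:42"))
print(dns_name("user:42"))
```

A linear scan over prefix lengths keeps the sketch short; a trie or a sorted interval table would be the usual choice once the table grows. Adding a longer prefix for a smaller server simply overrides the broader route for that slice of the keyspace.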
andrew311, about 14 years ago
This post is a great 101, but it doesn't provide much info on what kind of setup you run at Mixpanel. I see you ditched Cassandra, though, and you mention Riak. Are you using Riak, rolling your own sharding layer, or what?
akronim, about 14 years ago
Step 0 - have enough traction that you need to worry about sharding.