Replica Strategy in HDFS Is Not Good Enough

2 points by garfee over 11 years ago

1 comment

brugidou over 11 years ago
Comparing to MongoDB is a joke.

However, some more advanced strategies should be applied for very large HDFS clusters. The rack-aware strategy is actually better than what is described, because the probability distribution is not perfectly uniform. It all depends on the hardware, the location, etc. But with a very large number of blocks, the probability of losing data with a 3-node failure is unfortunately close to 1.

We could try to imagine a better strategy, placing replicas in cliques of nodes to mitigate the risk. It's a tradeoff between losing more data with lower probability or less data with high probability, I guess? Haven't done the math :)
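
For anyone who wants to do that math, here is a rough sketch of the tradeoff. It is a simplified model, not how HDFS actually places blocks. It assumes replicas land on a uniformly random set of nodes (rack awareness is ignored), that exactly as many nodes fail at once as there are replicas, and that the "cliques" are disjoint groups of nodes that each hold at least one block. The function names and the cluster sizes are made up for illustration.

```python
from math import comb


def p_loss_random(nodes: int, blocks: int, replicas: int = 3) -> float:
    """P(at least one block is lost) when `replicas` random nodes fail and
    each block sits on an independent, uniformly random set of nodes."""
    # Chance that one given block has all its replicas on the failed nodes.
    p_one_block = 1 / comb(nodes, replicas)
    return 1 - (1 - p_one_block) ** blocks


def p_loss_cliques(nodes: int, replicas: int = 3) -> float:
    """P(any data is lost) when nodes are split into disjoint groups of size
    `replicas` and every block lives entirely inside one group.
    Assumes `nodes` is divisible by `replicas` and no group is empty."""
    groups = nodes // replicas
    # Data is lost only if the failed nodes are exactly one whole group.
    return groups / comb(nodes, replicas)


def expected_blocks_lost(nodes: int, blocks: int, replicas: int = 3) -> float:
    """Expected number of lost blocks; under this model it is the same
    for both placements."""
    return blocks / comb(nodes, replicas)


if __name__ == "__main__":
    nodes, blocks = 300, 10_000_000  # hypothetical cluster
    print(f"random placement: P(loss) = {p_loss_random(nodes, blocks):.3f}")
    print(f"clique placement: P(loss) = {p_loss_cliques(nodes):.2e}")
    print(f"expected blocks lost (both) = {expected_blocks_lost(nodes, blocks):.2f}")
```

Under these assumptions the expected number of lost blocks is identical for both schemes. Cliques make a loss event much rarer, but when one group dies it takes every block in that group with it, which is the "more data with lower probability" side of the tradeoff.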