
Migrating From MongoDB To Riak At Bump

131 points by timdoug about 13 years ago

8 comments

stephen about 13 years ago
> During the migration, there were a number of fields that should have been set in Mongo but were not

Imagine that... this fascination with schema-less datastores just baffles me:

http://draconianoverlord.com/2012/05/08/whats-wrong-with-a-schema.html

I'm sure schema-less datastores are a huge win for your MVP release when it's all greenfield development, but from my days working for enterprises, it seems like you're just begging for data inconsistencies to sneak into your data.

Although, in the enterprise, data actually lives longer than 6 months--by which time I suppose most startups are hoping to have been bought out.

(Yeah, I'm being snarky; none of this is targeted at bu.mp--they obviously understand the pros/cons of schemas, having used protocol buffers and Mongo. I'm more just talking about how any datastore that's not relational these days touts the lack of a schema as an obvious win.)
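(For illustration only, not from the article: a minimal sketch of the kind of inconsistency a relational schema catches at write time while a schema-less store accepts silently. The table and field names are made up.)

```python
import sqlite3

# A relational schema rejects the bad write immediately.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
try:
    db.execute("INSERT INTO users (id, email) VALUES (1, NULL)")
except sqlite3.IntegrityError as e:
    print("rejected at write time:", e)   # NOT NULL constraint failed: users.email

# A schema-less store happily persists the same document; the missing
# field only surfaces much later, when some reader assumes it exists.
schemaless_store = []                      # stand-in for a document collection
schemaless_store.append({"id": 1})         # no 'email', nobody complains
print(schemaless_store[0].get("email"))    # None
```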
salsakran about 13 years ago
I was reading along and nodding my head until I got to the 1000-line Haskell program that handles issues stemming from a lack of consistency.

I'm not exactly a SQL fanboy, but maybe ACID is kinda useful in situations like this, and having to write your own application-land 1000-liners for stuff that got solved in SQL land decades ago isn't the best use of time?
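(To make the point concrete, a hedged sketch, not from the article, of the consistency guarantee a transactional SQL store gives you in a few lines; the schema and amounts are invented.)

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, "
           "balance INTEGER NOT NULL CHECK (balance >= 0))")
db.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
db.commit()

try:
    with db:  # one atomic transaction: both updates apply, or neither does
        db.execute("UPDATE accounts SET balance = balance - 150 WHERE name = 'alice'")
        db.execute("UPDATE accounts SET balance = balance + 150 WHERE name = 'bob'")
except sqlite3.IntegrityError:
    pass  # overdraft rejected; the CHECK constraint rolls the whole transfer back

print(db.execute("SELECT * FROM accounts").fetchall())  # [('alice', 100), ('bob', 0)]
```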
timhaines about 13 years ago
If you're thinking about using Riak, make sure you benchmark the write (put) throughput for a sustained period before you start coding. I got burnt by this.

I was using the LevelDB backend with Riak 1.1.2, as my keys are too big to fit in RAM.

I ran tests on a 5-node dedicated server cluster (fast CPU, 8GB RAM, 15k RPM spinning drives), and after 10 hours Riak was only able to write 250 new objects per second.

Here's a graph showing the drop from 400/s to 300/s: http://twitpic.com/9jtjmu/full

The tests were done using Basho's own benchmarking tool, with the partitioned sequential integer key generator and 250-byte values. I tried adjusting the ring_size (1024 and 128), and tried adjusting the LevelDB cache_size etc., and it didn't help.

Be aware of the poor write throughput if you are going to use it.
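(The commenter used Basho's own benchmarking tool; as a rough illustration only, here is a minimal sustained-write loop sketched with the Python riak client of that era. The bucket name, value size, and connection settings are assumptions, and a real test should run for hours, as described above.)

```python
import time
import riak  # assumes the Python riak client (riak.RiakClient API)

client = riak.RiakClient()              # connection details assumed (default host/port)
bucket = client.bucket("bench")         # hypothetical bucket name
value = {"payload": "x" * 250}          # roughly 250-byte values, as in the comment

duration, report_every = 10 * 60 * 60, 60   # sustained run (e.g. 10 hours), report each minute
key, count = 0, 0
t0 = window_start = time.time()
while time.time() - t0 < duration:
    bucket.new(str(key), data=value).store()
    key += 1
    count += 1
    if time.time() - window_start >= report_every:
        rate = count / (time.time() - window_start)
        print("writes/sec: %.0f" % rate)     # watch whether this degrades over time
        count, window_start = 0, time.time()
```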
tlianza about 13 years ago
I find these kinds of stories interesting, but without some feel for the size of the data, they're not very useful/practical.

I've heard of Bump, and used it once or twice, but I don't actually know how big or popular it is. If we're talking about a database for a few million users, only a tiny percentage of which are actively "bumping" at any time, it's really hard for me to imagine this is an interesting scaling problem.

E.g. if I just read an article about a "data migration" whose scale is something a traditional DBMS would yawn at, the newsworthiness would have to be re-evaluated.
_Lemon_ about 13 years ago
I have decided on wanting to use Riak as well. I was wondering if anyone had examples of how they used it with their data model?

For example, this article mentions "With appropriate logic (set unions, timestamps, etc) it is easy to resolve these conflicts". However, timestamps are not an adequate way to do this, because distributed systems only give you a partial ordering of events. The magicd may be serialising all requests to Riak to mitigate this (essentially using the time reference of magicd), in which case they're losing out on the distributed nature of Riak (magicd becomes a single point of failure / bottleneck).

Insight into how others have approached this would be awesome.
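(As a concrete, hedged example of the "set unions" approach the article mentions: a minimal sketch of merging conflicting sibling values by set union rather than by timestamp. It is plain Python, not tied to any particular Riak client API, and the sibling values are assumed to be lists of items such as contact IDs.)

```python
def resolve_siblings(siblings):
    """Merge conflicting sibling values by set union.

    `siblings` is the list of values read back after a write conflict,
    each assumed to be a list of items (e.g. contact IDs). Unlike
    last-write-wins on timestamps, a union never silently drops an item
    that only one replica saw -- though deletions then need extra
    bookkeeping (e.g. tombstones), which this sketch omits.
    """
    merged = set()
    for value in siblings:
        merged.update(value)
    return sorted(merged)

# Two replicas accepted different writes concurrently:
print(resolve_siblings([["alice", "bob"], ["alice", "carol"]]))
# ['alice', 'bob', 'carol']
```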
gizzlon about 13 years ago
Would be interesting to see a follow-up in 6 months or so.

It doesn't seem fair to compare [old tech] with [new tech] when you've felt all the pitfalls of one but not the other.
supo about 13 years ago
Random thought on protocol buffers: the OP advocates using the "required" modifier for fields and touts it as an advantage over JSON. I would move the field-value verification logic into the client instead, because "required" causes backwards-compatibility problems if you ever need to un-require a field.
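(A hedged sketch of what "validation in the client" could look like: the wire schema keeps every field optional, and a small check enforces presence at the application boundary. The message is represented as a plain dict so the example doesn't depend on generated protobuf classes; the field names are invented.)

```python
REQUIRED_FIELDS = {"user_id", "device_token"}   # enforced in code, not in the schema

def validate_bump_event(msg: dict) -> dict:
    """Reject messages missing fields we currently need.

    Because the wire format never marks these as 'required', old readers
    and writers keep interoperating if this set shrinks or grows later;
    only this function has to change.
    """
    missing = REQUIRED_FIELDS - msg.keys()
    if missing:
        raise ValueError("missing fields: %s" % ", ".join(sorted(missing)))
    return msg

validate_bump_event({"user_id": 42, "device_token": "abc"})   # ok
# validate_bump_event({"user_id": 42})                         # raises ValueError
```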
clu3 about 13 years ago
@timdoug, could you share the specific problems with Mongo that made or forced you to switch to Riak, please? "Operational qualities" is a little vague.