Migrating From MongoDB To Riak At Bump

131 points, posted by timdoug about 13 years ago

8 comments

stephen, about 13 years ago
> During the migration, there were a number of fields that should have been set in Mongo but were not

Imagine that... this fascination with schema-less datastores just baffles me:

http://draconianoverlord.com/2012/05/08/whats-wrong-with-a-schema.html

I'm sure schema-less datastores are a huge win for your MVP release when it's all greenfield development, but from my days working for enterprises, it seems like you're just begging for inconsistencies to sneak into your data.

Although, in the enterprise, data actually lives longer than 6 months, by which time I suppose most startups are hoping to have been bought out.

(Yeah, I'm being snarky; none of this is targeted at bu.mp. They obviously understand the pros and cons of schemas, having used pbuffers and mongo. I'm more just talking about how any datastore that's not relational these days touts the lack of a schema as an obvious win.)

salsakran, about 13 years ago
I was reading along and nodding my head until I got to the 1000-line Haskell program that handles issues stemming from a lack of consistency.

I'm not exactly a SQL fanboy, but maybe ACID is kinda useful in situations like this, and having to write your own application-land 1000-liners for stuff that got solved in SQL land decades ago isn't the best use of time?

timhaines, about 13 years ago
If you're thinking about using Riak, make sure you benchmark the write (put) throughput for a sustained period before you start coding. I got burnt by this.

I was using the LevelDB backend with Riak 1.1.2, as my keys are too big to fit in RAM.

I ran tests on a 5-node dedicated server cluster (fast CPU, 8 GB RAM, 15k RPM spinning drives), and after 10 hours Riak was only able to write 250 new objects per second.

Here's a graph showing the drop from 400/s to 300/s: http://twitpic.com/9jtjmu/full

The tests were done using Basho's own benchmarking tool, with the partitioned sequential integer key generator and 250-byte values. I tried adjusting the ring_size (1024 and 128) and the LevelDB cache_size, etc., and it didn't help.

Be aware of the poor write throughput if you are going to use it.
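
To make "benchmark for a sustained period" concrete, here is a minimal Python sketch that records writes per second in fixed windows, so a slow decline shows up as a falling series. It is not Basho's benchmarking tool and does not talk to Riak: `put`, `make_item`, and `windowed_throughput` are hypothetical names, and `put` stands in for whatever client call stores one object.

```python
import time


def windowed_throughput(put, make_item, duration_s, window_s, clock=time.monotonic):
    """Call put(key, value) in a loop and return writes/second per window.

    `put` is any callable that stores one object (hypothetical; wire it to
    your own client). `make_item(i)` returns the i-th (key, value) pair.
    A trailing partial window is dropped.
    """
    rates = []
    start = clock()
    window_start = start
    count = 0
    i = 0
    while True:
        now = clock()
        if now - window_start >= window_s:
            # Close the current window and record its average rate.
            rates.append(count / (now - window_start))
            window_start = now
            count = 0
        if now - start >= duration_s:
            break
        key, value = make_item(i)
        put(key, value)
        count += 1
        i += 1
    return rates


if __name__ == "__main__":
    payload = b"x" * 250  # 250-byte values, as in the comment above

    def discard_put(key, value):
        """Stand-in for a real client put; it stores nothing."""

    # Sequential integer keys, two short windows, just to show the call shape.
    print(windowed_throughput(discard_put, lambda i: (i, payload), 0.2, 0.1))
```

For a real run, the duration would be hours rather than fractions of a second, and the per-window rates would be plotted or logged so a drop like the one described above is visible.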

tlianza, about 13 years ago
I find these kinds of stories interesting, but without some feel for the size of the data, they're not very useful/practical.

I've heard of Bump, and used it once or twice, but I don't actually know how big or popular it is. If we're talking about a database for a few million users, only a tiny percentage of whom are actively "bumping" at any time, it's really hard for me to imagine this is an interesting scaling problem.

For example, if I just read an article about a "data migration" whose scale is something a traditional DBMS would yawn at, the newsworthiness would have to be re-evaluated.

_Lemon_, about 13 years ago
I have decided that I want to use Riak as well. I was wondering if anyone had examples of how they used it with their data model?

For example, this article mentions "With appropriate logic (set unions, timestamps, etc) it is easy to resolve these conflicts". However, timestamps are not an adequate way to do this, because events in a distributed system are only partially ordered. The magicd may be serialising all requests to Riak to mitigate this (essentially using magicd's time reference), in which case they're losing out on the distributed nature of Riak (magicd becomes a single point of failure and a bottleneck).

Insight into how others have approached this would be awesome.
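
The difference between the two strategies quoted above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Bump's code and not the Riak client API: the `Sibling` class, the contact names, and the 5-second clock skew are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sibling:
    """One conflicting version of a value (hypothetical shape, not a Riak type)."""
    contacts: frozenset
    timestamp: float  # wall-clock time reported by the node that took the write


def resolve_by_union(siblings):
    """Merge siblings by set union; the result does not depend on sibling order."""
    merged = frozenset()
    for s in siblings:
        merged |= s.contacts
    return merged


def resolve_by_timestamp(siblings):
    """Last-write-wins: keep only the sibling with the largest timestamp."""
    return max(siblings, key=lambda s: s.timestamp).contacts


# Two nodes accept concurrent writes. Assume node B's clock runs behind node A's,
# so its write carries the smaller timestamp regardless of the real order.
a = Sibling(frozenset({"alice", "bob"}), timestamp=1000.0)
b = Sibling(frozenset({"alice", "carol"}), timestamp=996.0)

union_result = resolve_by_union([a, b])    # keeps alice, bob, and carol
lww_result = resolve_by_timestamp([a, b])  # keeps only node A's version
```

Union is commutative, associative, and idempotent, so every replica converges on the same merged set without a shared clock, while last-write-wins silently discards "carol". The trade-off is that a plain union cannot express deletions: a removed element reappears whenever an older sibling still contains it.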

gizzlon, about 13 years ago
Would be interesting to see a follow-up in 6 months or so...

It doesn't seem fair to compare [old tech] with [new tech] when you've felt all the pitfalls with one but not the other.

supo, about 13 years ago
Random thought on protocol buffers: OP is advocating using the "required" modifier for fields and touting it as an advantage in comparison to JSON. I would move the field value verification logic to the client, because "required" can cause backwards-compatibility problems if you later un-require a field.
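
One way to read this suggestion: leave every field optional on the wire and check presence in application code, so the set of required fields can change between versions without breaking older readers. A minimal Python sketch, using plain dicts in place of decoded messages; the field names and the version table are hypothetical and are not taken from Bump's schema.

```python
# Hypothetical per-version rules; the wire format itself marks nothing as required.
REQUIRED_FIELDS_BY_VERSION = {
    1: {"user_id", "device_id"},
    2: {"user_id"},  # device_id was "un-required" in version 2
}


def missing_fields(record, schema_version):
    """Return the names of required fields that are absent or None in record."""
    required = REQUIRED_FIELDS_BY_VERSION[schema_version]
    return {name for name in required if record.get(name) is None}


record = {"user_id": 42}
missing_v1 = missing_fields(record, 1)  # device_id is missing under version 1 rules
missing_v2 = missing_fields(record, 2)  # nothing is missing under version 2 rules
```

Because the rule lives in code rather than in the message definition, relaxing it is a code change on the validating side only; readers built against the older rules can still decode the record.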

clu3, about 13 years ago
@timdoug, could you please share the specific problems with Mongo that made (or forced) you to switch to Riak? "Operational qualities" is a little vague.