Hadoop reaches 1.0 and my understanding of how to use it is still in development.<p>Does anyone have a high-level resource on how MapReduce works for mediocre programmers like myself who are late to the game? I know she's not ready to have my babies, but surely I could get to know her a little, maybe just be friends? I grabbed a pre-made Hadoop virtual machine the other month and was so far over my head that I had to run away to regroup.<p>In general I have some very unoptimized problems that MapReduce probably isn't the right shoe for, but I'd love to be able to explain to my boss <i>why</i> it's the wrong shoe. Learning about it might be a great start down that path.
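For what it's worth, the core idea fits in a few lines of plain Python. This is only a single-process sketch of the map, shuffle, and reduce steps using the usual word-count example. It does not touch Hadoop at all, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Mapper: emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would do."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence on an in-memory list."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["the quick brown fox", "the lazy dog", "The fox"]
    print(word_count(docs))
```

In a real cluster the mappers and reducers run on many machines and the shuffle moves data over the network, which is where most of the cost comes from. If a problem does not split into independent per-record map steps followed by a per-key combine, that is usually a sign it is the wrong shoe.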
Hadoop versioning has always been a little confusing to me:<p>* 0.23.0: 11 November, 2011<p>* 0.22.0: 10 December, 2011<p>Now we have 1.0, but it's based on 0.20, not on either of the more recent releases?<p>The 1.0 release notes are pretty useless; they're just a list of issues. Is there a summary anywhere?
For those interested, here is a pretty good discussion on HN comparing different NoSQL DBs: <a href="http://news.ycombinator.com/item?id=2052852" rel="nofollow">http://news.ycombinator.com/item?id=2052852</a>
Awesome! Now if we can just get HBase to update its prereqs and bump its version, I can have some symmetry in my life!<p>On a more serious note: is anyone using HDFS for the kind of thing WebHDFS was designed for? We're currently looking at HDFS as an Event Store mechanism, but it appears to me to be oriented toward pretty large files and streams, and I'm wondering how it will stack up if we want to do something that involves files much smaller than, say, 64MB.
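On the small-file question, a rough back-of-the-envelope sketch may help. The NameNode keeps the namespace (files and blocks) in memory, so many tiny files cost far more metadata than a few large ones holding about the same data. The 150 bytes per object used below is an assumed rule of thumb, not a measured figure, and the function names are hypothetical.

```python
import math

BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB, the block size mentioned in the question
BYTES_PER_OBJECT = 150  # ASSUMPTION: rough rule of thumb for NameNode memory per object


def namenode_objects(num_files, file_size):
    """Count the file and block objects the NameNode would have to track."""
    # A file smaller than one block still needs one block entry.
    blocks_per_file = max(1, math.ceil(file_size / BLOCK_SIZE))
    return num_files * (1 + blocks_per_file)


def estimate_namenode_bytes(num_files, file_size):
    """Very rough metadata memory estimate under the assumption above."""
    return namenode_objects(num_files, file_size) * BYTES_PER_OBJECT


if __name__ == "__main__":
    # Roughly 10 GB stored two ways: many 1 KB files vs. a few 64 MB files.
    small = estimate_namenode_bytes(10_000_000, 1024)
    large = estimate_namenode_bytes(160, BLOCK_SIZE)
    print(f"10M x 1 KB files: ~{small / 1e9:.1f} GB of NameNode memory")
    print(f"160 x 64 MB files: ~{large / 1e3:.1f} KB of NameNode memory")
```

A small file does not waste a full 64MB of disk, so the pressure is on metadata and per-file overhead rather than raw storage. For an event store, batching events into larger files before writing them would sidestep most of this.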
I agree with an earlier comment. Big Data, Hadoop, etc. are keywords that are supposed to get big in 2012. However, as a regular web dev, it's hard for me to grasp what they can do for you unless you have gigantic data stores.
Congrats on the milestone to those involved. It's great to have something like this available to everyone for free.<p>On a side note, and not to take anything away from the H-team, I'm pretty curious about how it compares to Google's GFS and the rest of their distributed computing stack (MR, Chubby, etc.). It would be sweet if Google released some or all of these some day.