Getting Creative with MapReduce

58 points by sritchie over 13 years ago

3 comments

moultano over 13 years ago
My approach to this is to take the complicated bits of the mapreduce and put them in a separate class. Then I do a combination of two things as appropriate:

1. Hook it up to a debug server that fetches from the same datastore as the mapreduce, then test it on some keys that I'm interested in.

2. Test it like any other class.

The only awkward part of this is abstracting out the output calls, which I usually do by passing in a "handle some data" callback that outputs in the mapreduce and dumps some pretty HTML in the debug server.

The great part about this is that if the mapreduce ends up being something important, you already have the tools to introspect its internals on data you are interested in.
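A minimal Python sketch of the callback idea described above, not moultano's actual code. The class `WordCountLogic`, the `Emit` callback type, and the helpers `run_as_mapper` and `render_debug_html` are all hypothetical names, and the word-count logic is only a stand-in for the "complicated bits". The mapper and the debug page are simulated with plain functions rather than a real MapReduce framework or web server.

```python
from collections import Counter
from html import escape
from typing import Callable, Iterable, List, Tuple

# Hypothetical callback type: receives one (key, value) output record.
Emit = Callable[[str, int], None]


class WordCountLogic:
    """Stand-in for the 'complicated bits', free of any framework types."""

    def process(self, doc_id: str, text: str, emit: Emit) -> None:
        # Count lowercased words in one record and hand each result
        # to whatever output callback the caller supplied.
        counts = Counter(text.lower().split())
        for word, n in sorted(counts.items()):
            emit(word, n)


def run_as_mapper(
    logic: WordCountLogic, records: Iterable[Tuple[str, str]]
) -> List[Tuple[str, int]]:
    """Simulated map phase: the callback collects output records."""
    out: List[Tuple[str, int]] = []
    for doc_id, text in records:
        logic.process(doc_id, text, lambda k, v: out.append((k, v)))
    return out


def render_debug_html(logic: WordCountLogic, doc_id: str, text: str) -> str:
    """Simulated debug page: the callback builds HTML table rows instead."""
    rows: List[str] = []

    def emit_row(key: str, value: int) -> None:
        rows.append(f"<tr><td>{escape(key)}</td><td>{value}</td></tr>")

    logic.process(doc_id, text, emit_row)
    return f"<h1>{escape(doc_id)}</h1><table>{''.join(rows)}</table>"
```

Because `process` only ever talks to the callback, the same object can be exercised from a unit test, from the job, or from a debug page that looks up one key of interest.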
epenn over 13 years ago
"The Cascalog abstraction layer fixes this issue by separating logic from data, allowing you to play creatively at massive scale."

I just checked out Cascalog and I like what I see, although I have yet to try it out myself. Does anyone know of something similar that would work with Scala as well?
mitultiwari over 13 years ago
Nice.

Check out another similar Clojure library called "MR-Kluj" that you can use to write Hadoop MapReduce jobs in Clojure: https://github.com/cheddar/mr-kluj