We've sometimes used a hack that is a minor variant of micro-batch processing in MapReduce. We: a) map the latest batch of data; b) in the reducer, join it with a cache saved from the previous reduction; and c) partially reduce, save a new cache, and proceed with further reductions. (We use our homegrown MapReduce implementation, which allows multiple rounds of reduction and access to the filesystem, so I'm not sure this would work in Hadoop.)
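To make the pattern concrete, here's a minimal single-process sketch of the idea, not our actual framework: every name (`map_record`, `reduce_values`, `CACHE_PATH`, the record fields) is made up for illustration, and the cache is just a JSON file on disk standing in for whatever the filesystem access would look like in practice.

```python
import json
import os
from collections import defaultdict

CACHE_PATH = "reduce_cache.json"  # hypothetical cache of partially reduced results

def map_record(record):
    # a) map step: emit (key, value) pairs for one record of the latest batch
    yield record["user"], record["count"]

def reduce_values(key, values):
    # c) reduce step: a simple sum here, but any associative reduction works
    return sum(values)

def process_batch(batch):
    # b) load the cache from the previous reduction so the reducer can join
    # the new batch's values with the earlier partial result
    if os.path.exists(CACHE_PATH):
        with open(CACHE_PATH) as f:
            cache = json.load(f)
    else:
        cache = {}

    grouped = defaultdict(list)
    for record in batch:
        for key, value in map_record(record):
            grouped[key].append(value)

    # partially reduce the new values together with the cached result,
    # then save the updated cache for the next micro-batch
    for key, values in grouped.items():
        if key in cache:
            values.append(cache[key])
        cache[key] = reduce_values(key, values)

    with open(CACHE_PATH, "w") as f:
        json.dump(cache, f)
    return cache
```

The point is just that the reducer treats the cached partial reduction as one more input value, so each micro-batch only has to re-reduce against the cache rather than the full history.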