TechEcho
Storm: a Realtime Computation System Similar to Hadoop

30 points by EzGraphs over 12 years ago

6 comments

terhechte over 12 years ago
Previous discussion: http://news.ycombinator.com/item?id=3014039
Xorlev over 12 years ago
It's barely similar beyond being a fault-tolerant system for scaling computation. Storm provides real-time streaming computation. Your spouts provide infinite streams of tuples, small objects that hold serialized values of other types, and for each tuple you receive you then emit 0 or more new tuples.

You could liken it to a streaming MapReduce that you can rearrange into directed graphs of data flows, called topologies.

Re: Spark, it's a totally different paradigm: it's like a MapReduce that takes advantage of memory locality, where Hadoop takes advantage of disk locality. Hive on Spark is a pretty beastly system.
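
To make the spout/tuple/topology idea concrete, here is a toy sketch in plain Python. It does not use Storm's actual API; the names `sentence_spout`, `split_bolt`, `count_bolt`, and `run_topology` are all made up for illustration, and Python generators stand in for Storm's distributed streams. It only shows the shape of the data flow: an endless source of tuples, and processing steps that each emit 0 or more tuples per input tuple.

```python
from collections import Counter
from itertools import islice
from typing import Iterable, Iterator, List, Tuple


def sentence_spout() -> Iterator[Tuple[str]]:
    """Hypothetical spout: an endless stream of one-field tuples."""
    sentences = ["storm is realtime", "hadoop is batch"]
    i = 0
    while True:
        yield (sentences[i % len(sentences)],)
        i += 1


def split_bolt(stream: Iterable[Tuple[str]]) -> Iterator[Tuple[str]]:
    """Hypothetical processing step: emits 0 or more word tuples per sentence."""
    for (sentence,) in stream:
        for word in sentence.split():
            yield (word,)


def count_bolt(stream: Iterable[Tuple[str]]) -> Iterator[Tuple[str, int]]:
    """Hypothetical processing step: emits a running count for each word seen."""
    counts: Counter = Counter()
    for (word,) in stream:
        counts[word] += 1
        yield (word, counts[word])


def run_topology(limit: int) -> List[Tuple[str, int]]:
    """Wire the steps into a linear data flow and take the first `limit` results."""
    topology = count_bolt(split_bolt(sentence_spout()))
    return list(islice(topology, limit))


if __name__ == "__main__":
    for word, count in run_topology(6):
        print(word, count)
```

The stream never ends on its own, so the sketch cuts it off with `islice`; a real streaming system would keep running and would spread each step across machines, which this single-process toy does not attempt.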
_jmar777 over 12 years ago
Storm is actually not similar to Hadoop at all. I think this title resulted from a misreading of the README, which states: "Storm is a distributed realtime computation system. Similar to how Hadoop provides a set of general primitives for doing batch processing, Storm provides a set of general primitives for doing realtime computation."

/nitpick
dkhenry over 12 years ago
So which is better, Storm [1] or Spark [2]?

1. http://storm-project.net/
2. http://www.spark-project.org/
EzGraphs over 12 years ago
JRuby DSL and integration for Storm here: https://github.com/colinsurprenant/redstorm
t-crayford over 12 years ago
How is this better than Hadoop?