科技回声

lmm, over 11 years ago

I like Spark over Hadoop just from an interface point of view, particularly the ability to just start up a (Scala) shell and start playing around. Hadoop can be very effective, but even getting "hello world" to run requires an intimidating array of setup.


hobbyist, over 11 years ago

I often read that Spark avoids the costly synchronization required in MapReduce because it uses DAGs. Can someone explain how that is achieved? If the application allows you to launch jobs together, that can be done even with Hadoop/MapReduce. If one job requires the output of another, then that job has to wait for synchronization whether it's MapReduce or a DAG.
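
One way to picture the question above is a minimal sketch in plain Python, not Spark or Hadoop code. The dependency between steps does not go away, but steps that need only one record at a time can be chained into a single pass instead of each step materializing its full output before the next one starts. The names `run_as_separate_jobs` and `run_as_pipeline` and the returned counter are hypothetical, introduced only for illustration; real engines differ in many details.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

Step = Callable[[int], int]


def run_as_separate_jobs(data: List[int], steps: List[Step]) -> Tuple[List[int], int]:
    """Run each step as its own job: the full output is built before the next step starts."""
    materialized = 0  # how many complete intermediate datasets were produced
    current = data
    for step in steps:
        current = [step(x) for x in current]  # barrier: the whole dataset exists here
        materialized += 1
    return current, materialized


def run_as_pipeline(data: Iterable[int], steps: List[Step]) -> Tuple[List[int], int]:
    """Chain the steps lazily: each record flows through every step in one pass."""
    stream: Iterator[int] = iter(data)
    for step in steps:
        stream = map(step, stream)  # lazy: nothing is computed yet
    result = list(stream)  # the only point where a full dataset is materialized
    return result, 1


# Three per-record steps (illustrative only).
steps = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
separate, n_separate = run_as_separate_jobs([1, 2, 3], steps)
pipelined, n_pipelined = run_as_pipeline([1, 2, 3], steps)
```

Both versions compute the same result; the difference is how many full intermediate datasets exist along the way. A step that needs every record from the previous step (a group-by-key, for example) still forces a wait in either model, which matches the point made in the question.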


justinkestelyn, over 11 years ago

Some interesting use cases are also described on Cloudera's developer blog, at http://blog.cloudera.com/blog/2013/11/putting-spark-to-use-fast-in-memory-computing-for-your-big-data-applications/.

fintler, over 11 years ago

Although Spark is nice, I'm also looking forward to MPI/orted integration with Hadoop...

"Performance: Launches ~1000x faster, runs ~10x faster"

"Launch scaling: Hadoop (~N), MR+ (~logN)"

"Wireup: Hadoop (~N^2), MR+ (~logN)"

http://slurm.schedmd.com/slurm_ug_2012/MapRedSLURM.pdf

wheaties, over 11 years ago

What I would love to know is whether Mahout works out of the box with Spark, or whether there's a third-party library that bridges the two.


MapReduce and Spark

5 comments
