Large-Scale Machine Learning with Spark on Amazon EMR

47 points by eaxitect, almost 10 years ago

2 comments

MoOmer, almost 10 years ago
I do the same on Google Compute Engine, except without the auto-terminate and scaling :(

However, Google's bdutil has a great set of shell scripts which auto setup the environment; and, with minimal changes you can set up the exact Scala/Spark versions you need.

The fact that I (just one dude) can set up a pipeline and chomp through TBs of data on clusters with TBs of memory over the course of hours still keeps me in awe of the advances of both GCE and AWS.

I'll have to give EMR/AWS a shot!
jeffreysmith, almost 10 years ago
Jeff here. Glad that people are interested in this post. Feel free to ping me with any questions.