Large Scale Distributed Deep Learning on Hadoop Clusters

42 points by cjdulberger, over 9 years ago

1 comment

duggan, over 9 years ago
Both this and Twitter Engineering's recent post[1] on HDFS make me wonder whether HDFS is something a team would reach for in 2015.

I'm starting to read into the technologies in this area (i.e., I have not used much of the Hadoop stack yet), and I haven't found a fundamental reason why one would not base their batch processing on S3 (or your object store of choice). Existing software appears to make assumptions about the storage medium being a local hard drive.

Much of the challenge of HDFS appears to be around scaling the NameNode, and provisioning capacity. S3 dispenses with these issues, and the only cost appears to be throughput.

If software like Spark was modified to have a much more native approach to S3, could HDFS be dispensed with entirely?

[1] https://blog.twitter.com/2015/hadoop-filesystem-at-twitter