
How to process a million songs in 20 minutes

141 points by brianwhitman, over 13 years ago

4 comments

joebo, over 13 years ago
It would be interesting to know how long it takes to run locally on a single instance.
KirinDave, over 13 years ago
So... MapReduce? Kinda figured that when you had 6 zeros after your first digit.

This looks like a fun project, but I can't help but feel like Hadoop experience reports are a little late to the party at this point. Is there anyone out there who doesn't immediately think MapReduce when they see numbers at scale like this? If anything, the tool is *overused*, not neglected.
dvcat, over 13 years ago
Can anyone clarify whether the song data is dense? If it is, I am not sure MapReduce is the right paradigm, mainly because you will eventually reach a point where transfer time overwhelms compute time.
revorad, over 13 years ago
Can you dynamically adjust the number of EC2 instances to optimise for processing time or price?