Scaling PageRank with R on Rescale

41 points by gpoort, over 11 years ago

3 comments

srean, over 11 years ago
These days a cheap way to get attention seems to be to Hadoop'ize a well-known but expensive operation. "Bogo-sort too slow? No worries, we will run it on the cloud." These approaches are buzzword compatible. They keep the marketing guys happy. Managers do not get fired for embracing the cloud/Hadoop, even if it is done in a grossly inefficient way.

I realize I bring unwelcome rain, but the post really raises the question of whether using R here is the best use of resources, technological and human.
etrain, over 11 years ago
If you want to look at scaling PageRank on commodity clusters with R, take a look at the newly released SparkR (http://amplab-extras.github.io/SparkR-pkg/) and possibly just call into GraphX (https://github.com/amplab/graphx).
greglindahl, over 11 years ago
I wonder why they call the algorithm PageRank when their example is academic citations? The academic citation index existed long before PageRank took that concept to the Web.

People even gamed it the same way (citation clubs == link-building networks).