Evaluating Bing with Mechanical Turk

43 points by lukas, almost 16 years ago

4 comments

mikeliu, almost 16 years ago
As the article concluded, the difference was small (to me it's pretty much insignificant). It doesn't really matter to me if the target is the first link, as long as it's on the first page.

I think better comparisons would be who has the better infrastructure: who can deliver results faster, serve more queries, use less energy, and crawl faster? I think Google is hard to beat here, and I think it's where the others have to think harder.
whughes, almost 16 years ago
To me, the most interesting aspect of the post was the comparison between Bing and the old Live. I'd have liked to see a Live-Google comparison and perhaps Yahoo! as well, since it's often thrown around as a comparable alternative.

This seems to confirm my idea that Bing is less a revamp of the search engine and more a rebranding for Microsoft. Bing is certainly more memorable than Live or MSN were, and it's replacing Live and other MS brands in several non-search areas (Virtual Earth -> Bing Maps for Enterprise, for example). That's certainly nothing new for Microsoft, but this time they seem to be marketing it as a search engine change to get people to try the engine and hopefully switch from the big G.
shalmanese, almost 16 years ago
I thought this was an interesting study but, after spending a few minutes trying to find patterns in the data, I finally hacked up a quick null hypothesis graph in Excel and it looked virtually indistinguishable:

http://blog.figuringshitout.com/another-way-to-lie-with-statistics
gojomo, almost 16 years ago
I trust their statistical significance calculations... but at a glance, the distribution barely looks different from what I'd expect if the turkers picked at random (either on purpose or because search tastes are at some point arbitrary).

It'd be interesting to include such a graph, where all ratings are drawn at random (but with the same slightly-vs.-much proportions) for visual comparison.

Also: what would happen if all the individual queries on which the preference isn't statistically significant were discarded, or repeated until the preference becomes significant? (Or is "6-8 workers" enough for significance?)