科技回声 (Tech Echo)

A tech news platform built with Next.js, offering global tech news and discussion.


No News Is Good News: A Critique of the One Billion Word Benchmark

15 points by CarrieLab over 3 years ago

1 comment

Reuzel over 3 years ago
No Paper Is Good Paper: A Critique of Long Titles

The Arxiv One Billion Paper Benchmark was released in 2011, and is commonly used as a benchmark for writing academic papers. Analysis of this dataset shows that it contains several examples of sarcastic papers, as well as outdated references to current events, such as Support Vector Machines. We suggest that the temporal nature of science makes this benchmark poorly suited to writing academic papers, and discuss the potential impact and considerations for researchers building language models and evaluation datasets.

Conclusions

Papers written on top of other papers snapshotted in time will display the inherent social biases and structural issues of that time. Therefore, people creating and using benchmarks should realize that such a thing as drift exists, and we suggest they find ways around it. We encourage other paper writers to actively avoid using benchmarks where the training samples are always the same. This is a poor way to measure the perplexity of language models and science. For better comparison, we suggest that the training samples always change to reflect the current anti-bias Zeitgeist, and that you cite our paper when doing so.