
Decoding the ACL Paper: Gzip and kNN Rival BERT in Text Classification

34 points, by abhi9u, almost 2 years ago

3 comments

liliumregale, almost 2 years ago
The paper has recently been called into question for overestimating their performance relative to BERT: https://news.ycombinator.com/item?id=36758433. Might be good for the blog's author to take this into account in their explainer. The author's perspective sounds a bit too positive (and borderline salesmanlike).
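The evaluation issue that thread raises, in short: the paper's k=2 kNN counted a test item as correct if either of its two nearest neighbors had the right label, which is effectively a top-2 accuracy rather than a standard kNN prediction. Below is a minimal sketch of the gap, with random placeholder data standing in for the paper's gzip-based distances:

    import numpy as np

    def knn_labels(distances, train_labels, k=2):
        """Labels of the k nearest training items for each test item."""
        idx = np.argsort(distances, axis=1)[:, :k]
        return train_labels[idx]

    def top2_accuracy(neighbor_labels, test_labels):
        # Correct if EITHER of the two nearest neighbors matches --
        # an optimistic upper bound, not a real classifier's accuracy.
        return np.mean([y in nn for nn, y in zip(neighbor_labels, test_labels)])

    def knn_accuracy(neighbor_labels, test_labels):
        # Standard k=2 kNN with ties broken by the nearest neighbor;
        # with k=2 this always reduces to the 1-NN label.
        return np.mean(neighbor_labels[:, 0] == test_labels)

    rng = np.random.default_rng(0)
    distances = rng.random((5, 10))        # 5 test x 10 train, placeholder
    train_labels = rng.integers(0, 2, 10)
    test_labels = rng.integers(0, 2, 5)
    nn = knn_labels(distances, train_labels)
    print(top2_accuracy(nn, test_labels), knn_accuracy(nn, test_labels))

Under the standard scoring rule, the reported advantage over BERT reportedly shrinks or disappears on several of the benchmarks.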
numeri, almost 2 years ago
In addition to the evaluation issues, it looks like several of their test sets have significant overlap with the training sets [1]. Especially for a compression-based technique, having exact duplicates is going to help a lot.

[1] https://github.com/bazingagin/npc_gzip/issues/13
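A compression-based 1-NN benefits doubly from such overlap: a verbatim duplicate compresses against its training copy almost for free, so its distance is near zero and the (correct) training label is simply copied. A minimal sketch of measuring exact overlap, with made-up strings in place of the benchmark splits the linked issue examined:

    def overlap_fraction(train_texts, test_texts):
        """Fraction of test examples appearing verbatim in the training set."""
        train_set = set(train_texts)
        return sum(t in train_set for t in test_texts) / len(test_texts)

    train = ["stocks fell sharply today", "rain expected this weekend"]
    test = ["stocks fell sharply today", "new phone model released"]
    print(overlap_fraction(train, test))  # 0.5 -- one exact duplicate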
stri8ed, almost 2 years ago
In such a scheme, wouldn't synonyms of the same word be no closer to each other than any other random string?
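For isolated words, largely yes: gzip-based NCD only sees shared byte sequences, so a synonym pair is about as distant as a random string of similar length, and sentences stay close only through the words they share. A quick probe using the paper's NCD formulation (compressing the space-joined concatenation); note that gzip's fixed header overhead dominates very short inputs, so treat the absolute values as noisy:

    import gzip

    def ncd(a: str, b: str) -> float:
        """NCD: (C(ab) - min(C(a), C(b))) / max(C(a), C(b))."""
        ca = len(gzip.compress(a.encode()))
        cb = len(gzip.compress(b.encode()))
        cab = len(gzip.compress((a + " " + b).encode()))
        return (cab - min(ca, cb)) / max(ca, cb)

    print(ncd("big", "large"))  # synonyms share no bytes to exploit
    print(ncd("big", "xqzf"))   # a random string is roughly as distant
    s = "the storm caused big damage along the coast"
    print(ncd(s, s.replace("big", "large")))        # shared context keeps these close
    print(ncd(s, "a recipe for slow-cooked pasta")) # unrelated text drifts apart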