Crowdlab: Effective algorithms to handle data labeled by multiple annotators

2 points by anishathalye over 2 years ago

1 comment

anishathalye over 2 years ago
Many real-world datasets use multiple annotations per example to ensure higher-quality labels. CROWDLAB is a new set of algorithms that estimate 3 key quantities better than prior standard crowdsourcing algorithms like GLAD and Dawid-Skene: (1) a consensus label per example, (2) a confidence score for the correctness of the consensus label, and (3) a rating for each annotator.

The blog post gives some intuition for how it works, along with some benchmarking results, and the math and the nitty-gritty details can be found in this paper: https://cleanlab.github.io/multiannotator-benchmarks/paper.pdf

Happy to answer any questions related to multi-annotator datasets or data-centric approaches to ML in general here.
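
To make the three quantities concrete, here is a minimal sketch of the simplest possible baseline: plain majority vote. This is not CROWDLAB (nor GLAD or Dawid-Skene); the toy data, the function name `majority_vote_summary`, and the tie-breaking rule are all assumptions made up for illustration. It only shows what "consensus label", "confidence score", and "annotator rating" look like as outputs.

```python
from collections import Counter

# Hypothetical toy data: labels[example][annotator] = class label.
# Not every annotator labels every example.
labels = {
    "ex1": {"a": 0, "b": 0, "c": 1},
    "ex2": {"a": 1, "b": 1},
    "ex3": {"a": 0, "b": 0, "c": 1},
}


def majority_vote_summary(labels):
    """Majority-vote baseline (illustrative only, not CROWDLAB).

    Returns three dicts:
      consensus[example]  -> most common label among its annotators
      confidence[example] -> fraction of annotators who gave that label
      rating[annotator]   -> fraction of that annotator's labels that
                             match the consensus
    """
    consensus, confidence = {}, {}
    for example, votes in labels.items():
        counts = Counter(votes.values())
        # Highest count wins; ties go to the smallest label (an arbitrary,
        # deterministic choice made for this sketch).
        top_label, top_count = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        consensus[example] = top_label
        confidence[example] = top_count / len(votes)

    agree, total = Counter(), Counter()
    for example, votes in labels.items():
        for annotator, label in votes.items():
            total[annotator] += 1
            agree[annotator] += int(label == consensus[example])
    rating = {annotator: agree[annotator] / total[annotator] for annotator in total}
    return consensus, confidence, rating


consensus, confidence, rating = majority_vote_summary(labels)
print(consensus)
print(confidence)
print(rating)
```

In this toy data, annotator "c" disagrees with the majority on both examples they labeled, so their rating comes out as 0.0. Majority vote treats every annotator as equally reliable when forming the consensus, which is the kind of limitation that more careful methods are designed to address.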