Ask HN: How Does Garbage in Garbage Out Apply to LLMs?

3 points by shaburn almost 2 years ago

5 comments

brucethemoose2 almost 2 years ago
Set accuracy aside for a moment.

There is an opportunity cost to stuffing garbage into a model's limited parameter count. Every SEO bot article, angry tweet, or off-topic ingestion (like hair product comparisons or neutron star descriptions in your code completion LLM) takes up "space" that could instead be taken up by a textbook, classic literature, or whatever.

Generative AI works pretty well *in spite of* this garbage because of the diamonds in the rough. But I am certain the lack of curation and specialization leaves a ton of efficiency/quality on the table.
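A minimal sketch of this opportunity-cost point, assuming a fixed training budget and made-up document quality scores (real pipelines use classifier- or perplexity-based filters; everything below is illustrative):

```python
# Hedged sketch: with a fixed training budget, every low-quality
# document admitted displaces a better one. Quality scores and token
# counts here are invented for illustration.

TOKEN_BUDGET = 30  # stand-in for a model's finite capacity

docs = [
    ("classic literature excerpt", 0.90, 12),   # (label, quality, tokens)
    ("algorithms textbook chapter", 0.95, 15),
    ("SEO bot article", 0.10, 10),
    ("angry tweet thread", 0.20, 8),
]

# Greedily fill the budget with the highest-quality documents first.
selected, used = [], 0
for label, quality, tokens in sorted(docs, key=lambda d: d[1], reverse=True):
    if used + tokens <= TOKEN_BUDGET:
        selected.append(label)
        used += tokens

print(selected)  # the textbook and literature fit; the garbage is squeezed out
```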
PaulHoule almost 2 years ago
It learns to imitate what it is shown, so if you show it text from StackOverflow it will learn the wrong answers as well as the right ones, unless you are really good about filtering out the wrong answers.
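A minimal sketch of that filtering step, assuming a hypothetical dump of StackOverflow answers; the field names ("score", "is_accepted", "body") are illustrative, not the real data-dump schema:

```python
# Hedged sketch: admit an answer to the training corpus only if it is
# accepted or well-upvoted. Field names are illustrative, not the real
# StackOverflow dump schema.

def keep_answer(record: dict, min_score: int = 5) -> bool:
    """True if the answer looks trustworthy enough to train on."""
    return record.get("is_accepted", False) or record.get("score", 0) >= min_score

raw_answers = [
    {"body": "Use a context manager so the file is always closed.",
     "score": 42, "is_accepted": True},
    {"body": "Just call gc.collect() after every line.",
     "score": -3, "is_accepted": False},
]

training_corpus = [r["body"] for r in raw_answers if keep_answer(r)]
print(training_corpus)  # only the well-vetted answer survives
```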
jstx1 almost 2 years ago
1. It matters what training data the creators of the LLM use.

2. The reinforcement learning from human feedback (RLHF) step is important.

3. As a user, you need to ask questions well and know how to prompt the model to get the best results (see the sketch below).
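A small illustration of point 3 only: the same question asked vaguely versus with enough context. The prompts are hypothetical and no particular API is assumed:

```python
# Hedged illustration of point 3: garbage in, garbage out applies at
# inference time too. Prompts are hypothetical; no specific API is used.

vague_prompt = "Why is my code slow?"

specific_prompt = (
    "My Python function deduplicates a list of 1M strings with nested "
    "loops (O(n^2)). Suggest a faster approach and show a revised "
    "function. Target: Python 3.11, standard library only."
)

# The vague prompt gives the model almost nothing to condition on.
for name, prompt in [("vague", vague_prompt), ("specific", specific_prompt)]:
    print(f"{name}: {len(prompt.split())} words of usable context")
```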
compressedgas almost 2 years ago
Yes, GIGO even applies to humans.
rolph almost 2 years ago
Training with slang, jargon, euphemisms, and a promiscuous mix of dialects, versus training with colloquial language, proper grammar/syntax, and punctuation.