
科技回声 (TechEcho)

A technology news platform built with Next.js, providing global tech news and discussion.


© 2025 科技回声. All rights reserved.

LLMs Don't Know What They Don't Know–and That's a Problem

54 points by ColinEberhardt, 3 months ago

12 comments

corytheboyd, 3 months ago
The day can’t come fast enough where we just see things like this as trivial misuse of the tool— like using a hammer to drive in a screw. We use the hammer for nails and the screwdriver for screws. We use LLM for exploring data with language, and our brains for reasoning.
johnisgood, 3 months ago
Claude does ask questions for clarification or asks me to provide something it does not know though, at least it has happened many times to me. At other times I will have to ask if it needs X or Y to be able to answer more accurately, although this may be the same case with other LLMs, too. The former though was quite a surprise to me, coming from GPT.
CaffeineLD50, 3 months ago
I would suggest that LLMs don't actually know anything. The knowing is inferred.

An LLM might be seen as a kind of very elaborate linguistic hoax (at least as far as knowledge and intelligence are concerned).

And I like LLMs, don't get me wrong. I'm not a hater.
zamadatix, 3 months ago
I wonder how much of this is an inherent problem that is hard to engineer a solution into, versus "confidently guessing the answer every time yields a +x% gain for a model on all of the other benchmark results, so nobody wants to reward the opposite of that".
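The scoring incentive described in that comment can be made concrete with a small sketch (my own illustration, not from the thread): under a typical accuracy-only benchmark, a wrong answer and an honest "I don't know" both score zero, so any nonzero guess accuracy makes guessing strictly better than abstaining.

```python
def expected_score(p_correct: float, guess: bool) -> float:
    """Expected score under accuracy-only grading:
    1 point for a correct answer, 0 for a wrong answer,
    and 0 for abstaining ("I don't know")."""
    return p_correct if guess else 0.0

# Even a model that guesses correctly only 10% of the time
# outscores one that honestly abstains.
assert expected_score(0.10, guess=True) > expected_score(0.10, guess=False)
```

Unless a benchmark explicitly penalizes confident wrong answers more than abstentions, this payoff structure rewards exactly the overconfidence the thread is complaining about.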
rowanseymour, 3 months ago
I use Copilot every day, and every day I'm more and more convinced that LLMs aren't going to rule the world but will continue to be "just" neat autocomplete tools whose utility degrades the more you expect from them.
missedthecue, 3 months ago
Well, humans don't know what they don't know either. I think the bigger problem is that LLMs don't know what they *do* know.
NoPicklez, 3 months ago
Is it that they're overconfident, or that we are overconfident in their responses?

LLMs aren't an all-knowing power, much like ourselves, but we still take the opinions and ideas of others as true to some extent.

If you are using LLMs and taking their outputs as complete truths or working products, then you're not using them correctly to begin with. You need to exercise a degree of professional and technical skepticism with their outputs.

Luckily, LLMs are moving into the arena of being able to reason with themselves and test their assumptions before giving us an answer.

LLMs can push me in the wrong direction just as much as an answer to a problem on a forum can.
nottorp, 3 months ago
LLMs don't know, period. They can be useful for summarizing well- and redundantly-publicized information, but they don't "know" even that.
red-iron-pine, 3 months ago
To quote someone else: "at least when I ask an intern to find something, they'll usually tell me they don't know and then flail around; AI will just lie with full confidence to my face."
spwa4, 3 months ago
LLMs learn from the internet, where people refuse to admit they don't know something. I have to admit I'm not entirely surprised by this.
sheepscreek, 3 months ago
I consider this to be a solved problem. Reasoning models are exceptionally good at this. In fact, if you use ChatGPT with Deep Research, it can bug you with questions to the point of annoyance!

It could also have been the fact that my custom GPT instructions included stuff like "ALWAYS clarify something if you don't understand. Do not assume!"
TZubiri, 3 months ago
I find most articles of the sort "LLMs have this flaw" to be of a cynical, one-upmanship kind.

"If you say please, LLMs think you are a grandma." Well then, don't say you are a grandma. At this point we have a rough idea of what these things are and what their limitations are. People are using them to great effect in very different areas; their objective is usually to hack the LLM into doing useful stuff, while the article writers are hacking the LLM into doing stuff that is wrong.

If a group of guys is making applications with an LLM and another dude is making shit applications with the LLM, am I supposed to be surprised at the latter instead of the former? Anyone can make an LLM do weird shit; the skill and area of interest is in the former.