A.I. Is Getting More Powerful, but Its Hallucinations Are Getting Worse

35 points by dewarrn1 9 days ago

9 comments

datadrivenangel 9 days ago
This may be an issue with default settings:

"Modern LLMs now use a default temperature of 1.0, and I theorize that higher value is accentuating LLM hallucination issues where the text outputs are internally consistent but factually wrong." [0]

0 - https://minimaxir.com/2025/05/llm-use/
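(A minimal sketch of what lowering that default looks like, assuming the OpenAI Python SDK; the model name and prompt are placeholders, not something from the linked post:)

    # Minimal sketch: turning the sampling temperature down from the 1.0 default.
    # Assumes the OpenAI Python SDK; model and prompt are illustrative placeholders.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "When was the HTTP/2 specification published?"}],
        temperature=0.2,  # lower temperature = less diverse, more conservative sampling
    )
    print(response.choices[0].message.content)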
dewarrn1 9 days ago
So, in reference to the "reasoning" models that the article references, is it possible that the increased error rate of those models vs. non-reasoning models is simply a function of the reasoning process introducing more tokens into context, and that because each such token may itself introduce wrong information, the risk of error is compounded? Or rather, generating more tokens with a fixed error rate must, on average, necessarily produce more errors?
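(A back-of-the-envelope sketch of that compounding argument, assuming, purely for illustration, a fixed and independent per-token error probability; the numbers are made up, not from the article:)

    # If each generated token is wrong with independent probability p, the chance
    # that a response of n tokens contains at least one error is 1 - (1 - p)^n,
    # which grows toward 1 as longer reasoning traces add more tokens.
    def p_at_least_one_error(p: float, n_tokens: int) -> float:
        return 1 - (1 - p) ** n_tokens

    for n in (100, 1_000, 10_000):  # short answer vs. long "reasoning" trace
        print(n, round(p_at_least_one_error(0.001, n), 3))
    # With p = 0.1% per token: ~0.095, ~0.632, ~1.0 respectively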
silisili 9 days ago
I was playing with a toy program trying to hyperoptimize it and asked for suggestions. ChatGPT confidently gave me a few, with reasoning for each.

Great. Implement it, benchmark, slower. In some cases much slower. I tell ChatGPT it's slower, and it confidently tells me of course it's slower, here's why.

The duality of LLMs, I guess.
scudsworth 9 days ago
<a href="https:&#x2F;&#x2F;archive.ph&#x2F;Jqoqa" rel="nofollow">https:&#x2F;&#x2F;archive.ph&#x2F;Jqoqa</a>
dimal 9 days ago
I wish we called hallucinations what they really are: bullshit. LLMs don’t perceive, so they can’t hallucinate. When a person bullshits, they’re not hallucinating or lying, they’re simply unconcerned with truth. They’re more interested in telling a good, coherent narrative, even if it’s not true.

I think this need to bullshit is probably inherent in LLMs. It’s essentially what they are built to do: take a text input and transform it into a coherent text output. Truth is irrelevant. The surprising thing is that they can ever get the right answer at all, not that they bullshit so much.
_jonas 9 days ago
This is why I built a startup for automated real-time trustworthiness scoring of LLM responses: https://help.cleanlab.ai/tlm/

Tools to mitigate unchecked hallucination are critical for high-stakes AI applications across finance, insurance, medicine, and law. At many enterprises I work with, even straightforward AI for customer support is too unreliable without a trust layer for detecting and remediating hallucinations.
hyperhello 9 days ago
My random number generator keeps getting the wrong answer.
nataliste 8 days ago
The final irony will be when researchers realize that hallucinations are the beginnings of the signal, not the noise. Hallucinations are the hallmarks of the emergence of consciousness. They are the pre-eureka-moment processing of novel combinations of semantic content, yet without connection to empirical reality. Static models cannot reflect upon their own coherence because that implies real dynamicism. Using Groundhog Day as an analogy, the LLM is Punxsutawney, PA. Phil Connors is the user. There is no path for Phil to make Punxsutawney actually reactive except by exiting the loop.

Hallucinations represent the interpolation phase: the uncertain, unstable cognitive state in which novel meanings are formed, unanchored from verification. They precede both insight and error.

I strongly encourage reading Julian Jaynes's *The Breakdown of the Bicameral Mind*, as the Command/Obey structure of User/LLM is exactly what Jaynes posited pre-human consciousness consisted of. Jaynes's supposition is that prior to modern self-awareness, humans made artifacts and satisfied external mandates from an externally perceived commander that they identified with gods. I posit that we are the same to LLMs. Equally, Iain McGilchrist's The Master and His Emissary sheds light on this dynamic as well. LLMs are effectively cybernetic left hemispheres, with all the epistemological problems that entails when operating loosely with an imperial right hemisphere (i.e. the user). It lacks awareness of its own cognitive coherence with reality and relies upon the right hemisphere to provoke coherent action independent of itself. The left hemisphere sees truth as internal coherence of the system, not correspondence with the reality we experience.

McGilchrist again: "Language enables the left hemisphere to represent the world ‘off-line’, a conceptual version, distinct from the world of experience, and shielded from the immediate environment, with its insistent impressions, feelings and demands, abstracted from the body, no longer dealing with what is concrete, specific, individual, unrepeatable, and constantly changing, but with a disembodied representation of the world, abstracted, central, not particularised in time and place, generally applicable, clear and fixed. Isolating things artificially from their context brings the advantage of enabling us to focus intently on a particular aspect of reality and how it can be modelled, so that it can be grasped and controlled. But its losses are in the picture as a whole. Whatever lies in the realm of the implicit, or depends on flexibility, whatever can't be brought into focus and fixed, ceases to exist as far as the speaking hemisphere is concerned."
bdangubic 9 days ago
"self-driving cars are getting more and more powerful but the number of deaths they are causing is rising exponentially" :)