
Ask HN: Share your "LLM screwed us over" stories?

91 points | by ATechGuy | 4 months ago
Saw this today https://news.ycombinator.com/item?id=42575951 and thought that there might be more such cautionary tales. Please share your LLM horror stories for all of us to learn.

16 comments

starchild3001 | 4 months ago

Guys, it's a major 21st-century skill to learn how to use LLMs. In fact, it's probably the biggest skill anyone can develop today. So please be a responsible driver; learn how to use LLMs.

Here's one way to get the most mileage out of them:

1) Track the best and brightest LLMs via leaderboards (e.g. https://lmarena.ai/, https://livebench.ai/#/ ...). Don't use any s**t LLMs.

2) Make it a habit to feed in whole documents and ask questions about them, rather than asking the model to retrieve from memory.

3) Ask the same question to the top ~3 LLMs in parallel (e.g. top-of-line Gemini, OpenAI, and Claude models).

4) Do comparisons between results. Pick the best. Iterate on the prompt, question, and inputs as required.

5) Validate any key factual information via Google or another search engine before accepting it as fact.

I'm literally paying for all three top AIs. It's been working great for my compute and information needs. Even if one hallucinates, it's rare that all three hallucinate the same thing at the same time. The quality has been fantastic, and the intelligence multiplication is supreme.
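The parallel-query-and-compare workflow (steps 3 and 4 above) can be sketched in a few lines. The `ask_model_*` functions below are hypothetical stand-ins for real API clients, with canned answers for illustration; the point is the fan-out and majority-vote structure, not any vendor's SDK:

```python
from concurrent.futures import ThreadPoolExecutor
from collections import Counter

# Hypothetical stand-ins for real API clients (e.g. Gemini, OpenAI, Claude).
# Each takes a question and returns the model's answer as a string.
def ask_model_a(question): return "Paris"
def ask_model_b(question): return "Paris"
def ask_model_c(question): return "Lyon"

def cross_check(question, models):
    """Ask the same question to several models in parallel and return
    the majority answer, plus whether the models were unanimous."""
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        answers = list(pool.map(lambda m: m(question), models))
    counts = Counter(answers)
    best, votes = counts.most_common(1)[0]
    return best, votes == len(models)

answer, unanimous = cross_check(
    "What is the capital of France?",
    [ask_model_a, ask_model_b, ask_model_c],
)
print(answer)     # majority answer: "Paris"
print(unanimous)  # False -- one model dissented, so verify manually (step 5)
```

A non-unanimous result is exactly the signal the commenter describes: when the models disagree, fall back to step 5 and verify against a search engine.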
lunarcave | 4 months ago

Perhaps the story that doesn't get told often enough is how LLMs are changing how humans operate en masse.

When ChatGPT came out, I was increasingly outsourcing my thinking to LLMs. It took me a few months to figure out that this was actually harming me - I'd lost some of my ability to think through things.

The same is true for coding assistants; sometimes I disable the in-editor coding suggestions when I find that my coding has atrophied.

I don't think this is necessarily a bad thing, as long as LLMs are ubiquitous, proliferate throughout society, and are extremely reliable and accessible. But they are not there today.
thisguy47 | 4 months ago

The linked post is more a story of someone not understanding what they're deploying. If they had found a random blog post about spot instances, they likely would have made the same mistake.

In this case, the LLM suggested a potentially reasonable approach, and the author screwed themselves by not looking into what they were trading off for lower costs.
bvanderveen | 4 months ago

I'm surprised at how even some of the smartest people in my life take the output of LLMs at face value. LLMs are great for "plan a 5-year-old's birthday party, dinosaur theme", "design a workout routine to give me a big butt", or even rubber-ducking through a problem.

But for anything where the numbers, dates, and facts matter, why even bother?
olalonde | 4 months ago

Not me, but Craig Wright, aka Faketoshi, referenced court cases hallucinated by an LLM in his appeal.

https://cointelegraph.com/news/court-rejects-craig-wright-appeal-bitcoin-creator-case
mtmail | 4 months ago

ChatGPT claims our service has a feature which we don't have (for example, tracking people based on their phone number). Users register a free account, then complain to us. The first email is often a vague "It doesn't work" without details. Slightly worse are users who go ahead and make a purchase, then complain, then demand a refund. We had to add a warning to the account registration page.
LeoPanthera | 4 months ago

I'm currently shopping for a new car, and while I was asking questions at a dealer (not Tesla), they revealed that the sales guys use ChatGPT to look up information about the car, because it's quicker than trying to find things in their own database.

I did not buy that car.
troilboil | 4 months ago

Not mine, but a client of mine. Consultants sold them a tool that didn't exist, because the LLM hallucinated and told their salesperson it did. Not sure that's really the LLM's fault, but pretty funny.
alecco | 4 months ago

Share your "I used a tool and shot myself in the foot because I was lazy" stories.
woolion | 4 months ago

I've tried LLMs for a few exploratory programming projects. It feels kind of magical the first time you import a dependency you don't know and the LLM outputs what you want to do before you've even had time to think about it. However, I also think that for every minute I've gained with it, I've lost at least one to hallucinated solutions.

Even for fairly popular things (Terraform + AWS) I continuously got plausible-looking answers. After reading the docs carefully, the use case was not supported at all, so I just went with the 30-second (inefficient) solution I had thought of from the start. But I lost more than an hour.

Same story with the Ren'Py framework. The issue is that the docs are far from covering everything, and Google sucks, sometimes giving you a decade-old answer to a problem that has a perfectly good answer in more recent versions. So it's really difficult to decide how to look for an answer most efficiently, search engine or LLM. Both can be a stupid waste of time.
deadbabe | 4 months ago

For the story above: always ask the LLM to answer as if it were a Hacker News commenter before requesting advice.
0xDEAFBEAD | 4 months ago
I find it interesting how LLM errors can be so subtle. The next-token prediction method rewards superficial plausibility, so mistakes can be hard to catch.
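The plausibility point can be seen even in a toy decoder: greedy next-token selection maximizes local probability with no notion of factual correctness. The distribution below is invented for illustration only:

```python
# Toy next-token distribution for the prefix "The case was decided in".
# Probabilities are made up for illustration: a fluent but wrong
# continuation can easily outscore the true one.
next_token_probs = {
    "1987": 0.35,   # plausible-sounding, but wrong
    "1992": 0.30,   # the (pretend) true answer
    "court": 0.20,
    "<eos>": 0.15,
}

# Greedy decoding picks whichever token is locally most probable --
# it cannot distinguish a correct date from a merely fluent one.
greedy_choice = max(next_token_probs, key=next_token_probs.get)
print(greedy_choice)  # "1987"
```

The wrong answer reads just as smoothly as the right one, which is why such errors are hard to catch by eye.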
talldayo | 4 months ago

I think the conclusion in the thread is sound. If money means something to you, then don't ask AI for help spending it. Problem solved.
jpcookie | 4 months ago

Give me a real example of something in computer science they can't do. I'm interested, since ChatGPT is better than any professor I've had at any level in my educational career.
byyoung3 | 4 months ago

Doesn't seem like many people can come up with a specific and tangible example.
bflesch | 4 months ago

LLMs = an ad-free version of Google.

That's why people adopted them. Google got worse and worse, and now the gap is filled by LLMs.

LLMs have replaced Google, and that's awesome. LLMs won't cook lunch or fold our laundry, and until a better technology comes around that can actually do that, all promises around "AI" should be seen as grifting.