Saw this today https://news.ycombinator.com/item?id=42575951 and thought that there might be more such cautionary tales. Please share your LLM horror stories for all of us to learn from.
Guys, it's a major 21st century skill to learn how to use LLMs. In fact, it's probably the biggest skill anyone can develop today. So please be a responsible driver and learn how to use LLMs.

Here's one way to get the most mileage out of them:

1) Track the best and brightest LLMs via leaderboards (e.g. https://lmarena.ai/, https://livebench.ai/#/ ...). Don't use any s**t LLMs.

2) Make it a habit to feed in whole documents and ask questions about them, rather than asking the model to retrieve from memory.

3) Ask the same question to the top ~3 LLMs in parallel (e.g. top-of-the-line Gemini, OpenAI, and Claude models). A rough sketch of this fan-out step is below.

4) Compare the results and pick the best. Iterate on the prompt, question, and inputs as required.

5) Validate any key factual information via Google or another search engine before accepting it as fact.

I'm literally paying for all three top AIs. It's been working great for my computing and information needs. Even if one hallucinates, it's rare that all three hallucinate the same thing at the same time. The quality has been fantastic, and the intelligence multiplication is supreme.
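Not the commenter's actual setup, just a minimal sketch of step 3, assuming the official openai, anthropic, and google-generativeai Python packages, API keys in environment variables, and placeholder model names; swap in whatever the leaderboards currently rank on top.

    # Minimal sketch: ask the same question to three LLMs in parallel and print
    # the answers side by side for manual comparison. Model names are assumptions.
    import os
    from concurrent.futures import ThreadPoolExecutor

    from openai import OpenAI
    import anthropic
    import google.generativeai as genai

    PROMPT = "Summarize the attached document and list any factual claims I should verify."

    def ask_openai(prompt: str) -> str:
        client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
        resp = client.chat.completions.create(
            model="gpt-4o",  # placeholder: use whatever currently tops the leaderboards
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    def ask_anthropic(prompt: str) -> str:
        client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
        msg = client.messages.create(
            model="claude-3-5-sonnet-latest",  # placeholder model name
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}],
        )
        return msg.content[0].text

    def ask_gemini(prompt: str) -> str:
        genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
        model = genai.GenerativeModel("gemini-1.5-pro")  # placeholder model name
        return model.generate_content(prompt).text

    if __name__ == "__main__":
        askers = {"openai": ask_openai, "anthropic": ask_anthropic, "gemini": ask_gemini}
        with ThreadPoolExecutor(max_workers=3) as pool:
            futures = {name: pool.submit(fn, PROMPT) for name, fn in askers.items()}
        # Compare the three answers by hand; agreement is only a weak sanity check,
        # so still verify key facts via search before relying on them.
        for name, fut in futures.items():
            print(f"--- {name} ---\n{fut.result()}\n")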
Perhaps the story that doesn't get told more often is how LLMs are changing how humans operate en masse.

When ChatGPT came out, I was increasingly outsourcing my thinking to LLMs. It took me a few months to figure out that it was actually harming me - I'd lost a little of my ability to think through things.

The same is true for coding assistants; sometimes I disable the in-editor coding suggestions when I find that my coding has atrophied.

I don't think this is necessarily a bad thing, as long as LLMs are ubiquitous, proliferate throughout society, and are extremely reliable and accessible. But they are not there today.
The linked post is more a story of someone not understanding what they're deploying. If they had found a random blog post about spot instances, they likely would have made the same mistake.

In this case, the LLM suggested a potentially reasonable approach and the author screwed themselves by not looking into what they were trading off for lower costs.
I'm surprised at how even some of the smartest people in my life take the output of LLMs at face value. LLMs are great for "plan a 5 year old's birthday party, dinosaur theme", "design a work-out routine to give me a big butt", or even rubber-ducking through a problem.

But for anything where the numbers, dates, and facts matter, why even bother?
Not me, but Craig Wright aka Faketoshi referenced court cases hallucinated by an LLM in his appeal.

https://cointelegraph.com/news/court-rejects-craig-wright-appeal-bitcoin-creator-case
ChatGPT claims our service has a feature which we don't have (for example, tracking people by their phone number). Users register a free account, then complain to us. The first email is often a vague "It doesn't work" with no details. Slightly worse are the users who go ahead and make a purchase, then complain, then demand a refund. We had to add a warning on the account registration page.
I'm currently shopping for a new car, and while I was asking questions at a dealer (not Tesla), they revealed that the sales guys use ChatGPT to look up information about the car because it's quicker than trying to find things in their own database.

I did not buy that car.
Not mine but a client of mine. Consultants sold them a tool that didn't exist because the LLM hallucinated and told their salesperson it did. Not sure that's really the LLM's fault, but pretty funny.
I've tried LLMs for a few exploratory programming projects.
It kinda feels magical the first time you import a dependency you don't know and the LLM outputs what you want to do before you've even had time to think about it.
However, I also think that for every minute I've gained with it, I've lost at least one to hallucinated solutions.

Even for fairly popular things (Terraform+AWS) I continuously got plausible-looking answers. After carefully reading the docs, the use case was not supported at all, so I just went with the 30-second (inefficient) solution I had thought of from the start. But I lost more than an hour.

Same story with the Ren'Py framework. The issue is that the docs are far from covering everything, and Google sucks, sometimes giving you a decade-old answer to a problem that has a fairly good answer in more recent versions. So it's really difficult to decide how to most efficiently look for an answer, between search and an LLM. Both can be a stupid waste of time.
I find it interesting how LLM errors can be so subtle. The next-token prediction method rewards superficial plausibility, so mistakes can be hard to catch.
Give me a real example of something in computer science they can't do. I'm interested, since ChatGPT is better than any professor I've had at any level of my educational career.
LLMs = ad-free version of Google.

That's why people adopted them. Google got worse and worse, and now the gap is filled with LLMs.

LLMs have replaced Google, and that's awesome. LLMs won't cook lunch or fold our laundry, and until a better technology comes around that can actually do that, all promises around "AI" should be seen as grifting.