Generative AI could make search harder to trust

232 points by jedwhite over 1 year ago

42 comments

I actually experienced this the other day. Bought the new Baldur's Gate and was wondering what items to keep or sell (don't judge me, I'm a pack rat in games!)

I had found some silver ingots. The top search result for "bg3 silver ingot" is a content farm article that very confidently claims you can use them at a workbench in Act 3 to upgrade your weapons.

Except this is a complete fabrication: silver ingots exist only to sell, and there is no workbench. There is no mechanic (short of mods) that allows you to change a weapon's stats.

I'm pretty sure an LLM "helped" write the article because it's a lot of trouble to go through just to be straight up wrong - if you're a low effort content farm, why in the world would you go through the trouble of fabricating an entire game mechanic instead of taking the low effort "They exist only to be sold" road?

This experience has caused me to start checking the date of search results: if it's 2022 and before, at least it's written by a human. If it's 2023 and on, I dust off my 90's "everything on the World Wide Web is wrong" glasses.

pseudosavant over 1 year ago

I wonder if there will be a human information/knowledge equivalent of low-background steel (pre-WWII/nukes). Data from before a certain point won't be 'contaminated' with LLM stuff, but it'll be everywhere after that.

https://en.wikipedia.org/wiki/Low-background_steel

serf over 1 year ago

I think it'd be kind of neat in a backward way if we went back to the 'specialized encyclopedia' days of the 90s.

Web directories, 'Who's Who in Engineering' type lists, etc.

It's a step back from universal search engines being able to find stuff, but it's a step forward with regards to curation and quality of results; so I'm not sure if it's entirely a downgrade.

The early 90s 'website phonebook' type encyclopedias were interesting[0], but I always had to remind my mom "No, this isn't the entire internet, it's just a bunch of places that people like; the secret ones are 'unlisted'."

Note: I'm not saying this is better than a search engine, it's just an interesting end-result after search engines got polluted and modified to the point of uselessness that we're at now with Google.

[0]: https://www.amazon.com/Internet-Directory-Guide-Usenet-Bitnet/dp/0449908984

salynchnew over 1 year ago

Recently an article came out where someone said that the company I work for is a big user of WebAssembly, but the reality is that we don't use it.

After finding the contributed article (on a well-known news site, not Wired though), it looks like a tech founder might've been using ChatGPT to write an article about the uses for WASM. The arguments were generally sound, but I don't think that anyone did the work to manually check any of the facts they presented in it.

nonrandomstring over 1 year ago

More amusing and frightening is when people search about themselves and turn up AI generated crap. Googling yourself was always a lucky grab bag, with the possibility of long-forgotten embarrassments being dragged up. But at least you'd have to face facts.

Now I hear of people discovering they're in prison, married to random people they've never met, or are actually already dead.

What is this going to do to recon on individuals (for example by employers, border agents or potential romantic partners) when there's a good chance the reputation raffle will report you as a serial rapist, kiddy-fiddler or Tory politician?

lykahb over 1 year ago

SEO garbage has been poisoning search for years. Even before the chatbots, it had gotten to the point where most top results were crap. LLMs can surely make it much worse, though.

faizshah over 1 year ago

I started to go down a line of thinking where I think we might see a return to books in the next 3-5 years. The reason is that a book is a big collection of knowledge and people can post reviews about the quality of the book, whereas on the web you have no way of knowing what the quality of an article will be anymore.

zpeti over 1 year ago

Here's what people don't understand: this is mostly good for Google.

The worse organic results are, the more people will click on paid links. This is WHY everyone on HN is complaining about search results, because Google doesn't really have an incentive to give you really good results. They only need to be good enough to keep 95% of the population still using Google, but mostly expecting the good results to be ads.

Google ads are the equivalent of verification on FB and X. They just call it something different. The verified, high quality results will be paid.

IronWolve over 1 year ago

It's almost like AI just repeats the data it's fed, even incorrect data, without any real intelligence to determine if the data is correct.... /s

It's not simply garbage in, garbage out. There is no logic to verify and analyze the data. You are simply told what is popular in the data.

gumballindie over 1 year ago

The correct term is spamming. People are using these text generators to spam everyone and everything under the sun. It will be detrimental to the internet as many people will just give up on this huge pile of ... spam.

tivert over 1 year ago

We did it guys! We're definitely heading into a new era, one perfected by software engineers. I can't wait!

jowea over 1 year ago

AI powered citogenesis!

I'm starting to wish articles had inline citations as a standard.

kiernanmcgowan over 1 year ago

Without naming the company, I have seen specific examples of blog posts being written by AI, hallucinating a "fact", and then that "fact" re-surfacing inside of Bard.

It's xkcd's Citogenesis, automated and at internet scale: https://xkcd.com/978/

abruzzi over 1 year ago

I have to say--the opening paragraph doesn't describe a reality I'm familiar with:

> Web search is such a routine part of daily life that it’s easy to forget how marvelous it is. Type into a little text box and a complex array of technologies—vast data centers, ravenous web crawlers, and stacks of algorithms that poke and parse a query—spring into action to serve you a simple set of relevant results.

Web search has, for me, become a nasty twisted hall of mirrors well before generative AI. I almost never get fed relevant results, and I almost always have to go back and quote all my search terms because the search engine decided it didn't really need to use all of them (usually just one). The only difference is the poison was human generated. Generative AI will simply erase the 5% of results that might give me an answer quickly.

notamy over 1 year ago

https://archive.ph/2023.10.05-165142/https://www.wired.com/story/fast-forward-chatbot-hallucinations-are-poisoning-web-search/

michaelteter over 1 year ago

The signal-to-noise ratio of web search results has been trending toward utter uselessness for years; so while AI content will make it worse, it won't make it dramatically worse. We'll just advance toward useless at a higher rate.

qwerty456127 over 1 year ago

You should never have been trusting what you find on the web, let alone other media. I hope widespread usage of generative AIs including deepfakes will finally force the masses to start thinking more critically.

abujazar over 1 year ago

«Could»? Google has already been doing this for quite some time, at least in my region (Norway), and I’d say more than half of the suggestions Google provides as top results are false.

liampulles over 1 year ago

I think the insufficient accuracy in the output of LLMs is going to lead them to be a lot more niche than the current hype is hoping for. I think most people care that someone is taking accountability for what they are reading - not that it is necessarily correct but at least that someone thinks it is correct (and that someone can be taken to task if it is inaccurate).

If LLM usage in media becomes widespread, I'd pay for a service that identifies and hides the LLM shit for me.

hypnoosi over 1 year ago

It's great to hear that someone finds Replit's AI capabilities useful for accelerating their learning... However, I must point out that the post sounds like an advertisement for an overpriced product. Not poking the AI assistance in coding, it definitely can be valuable, but the effectiveness vs. cost-efficiency varies. 20 dollars per month for coding a chatbot, not worth it for me...

anigbrowl over 1 year ago

Surely, just as content farms have gradually trashed the quality of search results on major platforms. There's also an over-reliance on raw quantities; I often get irrelevant and unwanted news articles from India simply because the huge population of that country coupled with widespread use of English outweighs US content on the social graph.

whb101 over 1 year ago

This is a pretty intractable problem and my app is in the alpha-est of stages, but I built something for this purpose. It maps creators on a 2D grid (using React-flow) based on subject and lets users vote on their trustworthiness. https://www.graphting.org

Condition1952 over 1 year ago

Please get your answers from Anna’s Library

kordlessagain over 1 year ago

It'll make it easier to trust if you have your own index of documents. That's why I built this: https://mitta.us/

There still exists a problem that users need to run and manage their own indexes, at times.

figassis over 1 year ago

I think we all saw this coming, talked about it, articles were even published... but now it's news.

ironborn123 over 1 year ago

Wasn't there a paper a few months back, "Textbooks Are All You Need"? Yes, found it: https://arxiv.org/abs/2306.11644

So search engines in their traditional sense will be obsolete anyway.

1) GPT-4 and other such LLMs will generate textbooks and manuals for every conceivable topic.
2) These textbooks will be 'dehallucinated' and curated by known experts on particular topics, who have reputations to maintain. The experts' names will be advertised by the LLM provider.
3) People will search for stuff by chatting with the LLMs, which will in turn provide citations for the chat output from the curated textbooks.

dishsoap over 1 year ago

This article's about a year late; it has already happened.

kakaz over 1 year ago

Well, have you ever had to use Microsoft documentation? Imagine the whole Internet presenting content like that, with random flashes of animation and HTML5 whistles.

anjel over 1 year ago

More than Pinterest?

SubiculumCode over 1 year ago

I thought we were already there, just by paid foreign labor spamming every public discussion board.

kidsil over 1 year ago

As opposed to today where half my search results are the result of who has the larger SEO budget?

Havoc over 1 year ago

Yep. Same with code too. Confidently generates code using endpoints that don’t exist

LetsGetTechnicl over 1 year ago

Just another reason that I consider generative AI to be a lot like crypto. A lot of talk about it being the future but really only turns out to be dangerous or useless. I find it incredibly irresponsible that companies are shoving their latest AI tech into all their products when it's still unproven.

jeffreyw128 over 1 year ago

It’s especially terrifying that misinformation compounds multiplicatively with AI because it happens in 2 layers - once at the retrieval layer (where AI-generated content is worsening the problem of bad SEO content) and again at the retrieval augmented generation (RAG) LLM layer.

(shameless plug) At Metaphor (https://platform.metaphor.systems/), we’re building a search engine that avoids SEO content by relying on human curation + neural embeddings for our index + retrieval algorithm. Our mission is to ensure that the information we receive is as high quality and truthful as possible as AI adoption marches onwards. You (or your LLM) can feel free to give it a try :)
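A rough way to see the "multiplicative" compounding across two layers: suppose, hypothetically, that the retrieval layer returns accurate content with probability `p_retrieval`, that the generation layer faithfully conveys what it was given with probability `p_generation`, and that the two kinds of error are independent. Then the chance that an answer is right end to end is roughly the product of the two. The function name, the independence assumption, and the example rates below are illustrative assumptions, not figures from the comment or from any real system.

```python
def end_to_end_accuracy(p_retrieval: float, p_generation: float) -> float:
    """Chance an answer survives both layers, assuming independent errors.

    p_retrieval:  hypothetical probability the retrieved source is accurate.
    p_generation: hypothetical probability the LLM conveys its source faithfully.
    """
    for p in (p_retrieval, p_generation):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must be between 0 and 1")
    return p_retrieval * p_generation


if __name__ == "__main__":
    # Made-up rates, chosen only to illustrate the arithmetic.
    for p_r, p_g in [(0.95, 0.95), (0.90, 0.90), (0.80, 0.90)]:
        combined = end_to_end_accuracy(p_r, p_g)
        print(f"retrieval={p_r:.2f} generation={p_g:.2f} -> end-to-end={combined:.4f}")
```

Under these assumptions the combined accuracy is never higher than the weaker layer's, which is one reading of "compounds"; real systems may have correlated errors, so treat this as a sketch and not a model of any particular pipeline.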

daniel_iversen over 1 year ago

At first I thought the article was going to be about human-led misinformation. But with both hallucinations and human-fed misinformation (AI-helped or not!), I wonder whether we can use AI to fact-check or self-check results (both AI-generated and human ones), prompt us about potential misinformation, and link to relevant sources. That way AI could actually help solve the trust issue.

amelius over 1 year ago

We need more research into content moderation.

smeagull over 1 year ago

Lies, misinformation and blogspam have always existed. The problem hasn't changed at all.

mattlondon over 1 year ago

Or to use the technical term: "shat the bed". Welcome to the future.

23B1 over 1 year ago

That really sucks for all the people whose job it is to make search impossible to trust already /s

throwawaaarrgh over 1 year ago

Why are people calling them hallucinations and not just errors, flaws or bugs? You can't hallucinate if all of your perception is one internal state. Chatbots don't dream of electric sheep.

p0w3n3d over 1 year ago

And entropy rises... people thought AI would kill us with machine guns. AI will kill us by making us super stupid...

infoseek12 over 1 year ago

Leaving aside the article to discuss the source for a moment: when did Wired become so anti-tech?

There are good critical viewpoints, but most of the articles they are putting out at this point read like bitter diatribes. Which is a shame because they used to be an excellent publication.
