
Teach your LLM to answer with facts, not fiction

175 points by jinqueeny, almost 2 years ago

23 comments

jameshart, almost 2 years ago
'Facts' aren't as black and white as people think.

    "What does Charmander evolve into?"
    "What does the spell 'avada kedavra' do?"
    "What is the Sindarin word for 'friend'?"
    "What are the names of Santa's reindeer?"
    "Where did Robin Hood live?"
    "Where did Achilles die?"

These are all 'factual questions' you can find answers to from reputable sources like Wikipedia. Google displays 'fact boxes' for several of them. Wolfram Alpha provides answers for three of them. The answers to some of these questions are part of what passes for 'general knowledge' in some societies.

It's no surprise that LLMs trained on human writings produce text which claims things that aren't true are facts. Humans do that *all the time*.

There are well attested reputable sources that will tell you Abraham Lincoln was a vampire hunter, others that say he was a Lego Master Builder, and others still will tell you that among his notable quotes is "Party on dudes - be excellent to each other". So what's an LLM to do when it's trying to extend a paragraph of information about Abraham Lincoln?

When an LLM is suggesting what might come next in a piece of text... it doesn't know if it's supposed to guess a probable word from a Wikipedia article, an Onion article, a Project Gutenberg manuscript, or an Archive Of Our Own fanfic. So you get a bit of all that.
d1sxeyes, almost 2 years ago
"The Hitch Hiker's Guide to the Galaxy is an indispensable companion to all those who are keen to make sense of life in an infinitely complex and confusing universe. For though it cannot hope to be useful or informative on all matters, it does make the reassuring claim that where it is inaccurate, it is at least definitively inaccurate. In cases of major discrepancy it is always reality that's got it wrong. So, for instance, when the Guide was sued by the families of those who had died as a result of taking the entry on the planet Traal literally - it said "Ravenous Bugblatter Beasts often make a very good meal for visiting tourists" instead of "Ravenous Bugblatter Beasts often make a very good meal of visiting tourists" - the editors claimed that the first version of the sentence was the more aesthetically pleasing; summoned a qualified poet to testify under oath that beauty was truth, truth beauty, and hoped thereby to prove that the guilty party in this case was life itself for failing to be either beautiful or true."
Lerc, almost 2 years ago
It is not a good start that they begin with a dictionary definition of hallucination. While the similarities to what an LLM does are apparent enough for the term to be used, LLMs are under no obligation to behave like the dictionary definition of a hallucination.

In general, facts are not the answer to hallucinations. You can't possibly have every fact for every situation. The true solution to hallucinations is figuring out how to make a model say "I don't know".
NoZebra120vClip, almost 2 years ago
You know, aside from this being a blatant feature-length advertisement for what they're selling, I almost thought this was a clever idea.

I thought it involved prompting the LLM to write SQL code to query a knowledge base of documents, and index into them, so that you'd know where to look in the original documents for your authoritative answer. So it would be a meta-search agent.

But apparently, they intend the queried documents to feed back into training the LLM? That's just gasoline on a dumpster fire.
hax0ron3, almost 2 years ago
I am mostly a novice to the field of LLMs, but as a layman with a basic but admittedly very rough understanding of how they work algorithmically, I have a hunch that the same thing that makes these LLMs powerful AIs with interesting emergent behaviors is also what makes them occasionally get things wildly wrong and claim to know things that they do not know. They are supposed to be AIs, not carefully vetted encyclopedias. Sure, I get that people want to eventually use AI to do life-critical stuff like surgery, and at that point "hallucinations" become a real problem. But we are nowhere close to that point yet, I think, so I feel that the focus on "hallucinations" may be misleading. It is one thing to try to get a 30-year-old doctor not to make up nonsense on the fly at work; that makes sense. But if you try to prevent a 3-year-old kid from making up nonsense, that will probably hurt their development into a more powerful intelligence. Note: I know that the current popular LLMs do not actually learn past the scope of a single session, but I am sure that they soon will.
wizzwizz4, almost 2 years ago
Situation: people try to use these predictive-text chatbots as search engines.

Problem: LLMs are not search engines. They extrapolate, interpolate, and approximate (so-called "hallucinations") so they can always produce somewhat-plausible text completions.

Solution: create a search engine so good at returning relevant results that even an LLM can make use of it… then go to *significant* lengths to plug that search engine into the LLM, preventing people from reading the search results directly.

Why not simply *give people access to the search engine*‽ People *know how to use search engines*!! This is the fifth time I've seen an article like this, and I'm still… baffled. It's https://xkcd.com/2021/ all over again.
flanked-evergl, almost 2 years ago
I think the author of that title could well do with a refresher course in epistemology and physics, as it is just not possible to do what they suggest. But even more unfortunate is how many people fall for deceptive marketing that really should not even fool the average 16-year-old.
peterbonney, almost 2 years ago
If “allowing the execution of arbitrary database queries written by an LLM inside a SaaS application” is the answer, I’d love to know what the question is.
inopinatus, almost 2 years ago
Please don’t dump untreated content marketing in the reading fountain.
hertzrut, almost 2 years ago
For the life of me I can't understand why so many are obsessed with LLMs as search engines and knowledge databases when it seems impossible for them to be that.
SkyPuncher, almost 2 years ago
What's not jumping out at me is why you need an LLM for this. What unique code is the LLM actually generating when de-hallucinating?

From the looks of it, they're just advertising a SQL extension that adds vectorization and vector search. Further, it looks like the only thing the LLM is doing here is deciding which column to run the vector search on. Why is that even necessary? Why are you not pre-processing "vector'd" columns into a normalized format to query against?

They're basically adding an unnecessary LLM step to what amounts to a vector search. In fact, the LLM is essentially blindly deciding which column is the best column to pull an answer from.

EDIT: It just struck me how terrifyingly dangerous this blog is. Really tired of seeing this crap in the LLM community.

The basic premise of this blog is "give an LLM complete access to your database and let it decide how and where it should pull data from". This is basically useless without talking about how you prevent the LLM from pulling data from places you don't want it to.

A far better and safer approach remains to push your relevant fields to a separate place for your LLM. In the spirit of this blog, you should just index to a new table. More realistically, you should just put this in a vector store.
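A minimal sketch of the "separate vector store" approach this comment describes, in Python. The embed() function, the example rows, and the 384-dimensional vectors are placeholder assumptions standing in for a real embedding model and real data:

    # Sketch: copy only the fields the assistant may read into a small index,
    # then answer questions with a plain similarity search over that index.
    import numpy as np

    def embed(text: str) -> np.ndarray:
        # Placeholder embedding (assumption): swap in a real model here.
        rng = np.random.default_rng(abs(hash(text)) % (2**32))
        v = rng.standard_normal(384)
        return v / np.linalg.norm(v)

    # Only the allowed columns ever leave the main database.
    documents = [
        {"id": 1, "text": "Refunds are processed within 5 business days."},
        {"id": 2, "text": "Password resets require a verified email address."},
    ]
    index = [(doc["id"], embed(doc["text"])) for doc in documents]

    def search(query: str, k: int = 3) -> list[tuple[float, int]]:
        q = embed(query)
        scored = [(float(q @ vec), doc_id) for doc_id, vec in index]
        return sorted(scored, reverse=True)[:k]

    # The top-k rows (not the whole database) become the LLM's context.
    print(search("How long do refunds take?"))

The point, as the comment argues, is that the model only ever sees the retrieved rows, never the database itself.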
skybrian, almost 2 years ago
That will help a bit, but it's not going to fix it.

Using GPT-4 and Code Interpreter, I have asked it to write a function and test it, given some inputs and expected outputs. The function returned different values when it tested it, but it lied and said it worked as expected.

You need to read the code and the test outputs yourself. Or maybe have it write an automated test?

Despite this, it seems quite promising. I expect that in a year or two, some IDEs will come with a useful pair-programming feature.
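A small sketch of the "check it yourself" idea from this comment: run the model-written function against the known input/output pairs rather than trusting its claim that the tests passed. The slugify example and test cases are made up for illustration:

    # Sketch: verify model-generated code against expected I/O pairs yourself.
    # `generated_source` is an illustrative stand-in for real model output.
    import textwrap

    generated_source = textwrap.dedent("""
        def slugify(title):
            return title.lower().replace(" ", "-")
    """)

    test_cases = [
        ("Hello World", "hello-world"),
        ("Teach your LLM", "teach-your-llm"),
    ]

    namespace: dict = {}
    exec(generated_source, namespace)  # in practice, run untrusted code in a sandbox
    slugify = namespace["slugify"]

    for given, expected in test_cases:
        actual = slugify(given)
        status = "PASS" if actual == expected else "FAIL"
        print(f"{status}: slugify({given!r}) -> {actual!r}, expected {expected!r}")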
quickthrower2, almost 2 years ago
How to solve problem X (... but not really) ... use our product.
RyEgswuCsn, almost 2 years ago
I think LLMs need to be taught to say "I don't know" / "I am not sure" or something to that effect. Another approach might be to introduce an adversarial "censor" model to guard against hallucinations (or inappropriate answers).
kmeisthax, almost 2 years ago
One of the fun things I like to do with these examples is follow along at home. I don't have access to GPT-4 because I refuse to give money to OpenAI, but with GPT-3.5, "What is an LLM hallucination" actually gives:

- The usual knowledge cutoff warning

- An explanation of what a hallucination is

If (in a separate conversation) I give GPT-3.5 the exact prompt that explains what an LLM is, I get gaslighted instead. GPT-3.5 attempts to tell me that LLM stands for "Legal Master of Laws". Then it gives the knowledge cutoff warning, and then the same correct explanation Myscale got.

The rest of this article appears to be trying to turn GPT into a frontend for search engines. I don't know why people keep trying to do this.
masukomi, almost 2 years ago
I really REALLY wish people would stop assigning sentience to LLMs.

> In other words, a hallucination is an error in (or a false) perception of something real or concrete.

An LLM has no "perception"; it doesn't "believe" or "think" that the answers it provides are "correct" or "true" or even "false". It's just autocompleting strings with the most probable next words.

If we keep treating these things as if they're sentient entities that "want" to provide "correct" answers, we're going to keep tripping over our false assumptions about their answers.
afro88, almost 2 years ago
I was hoping this would be a way to train an LLM that somehow knew when to seek out external knowledge and when not to.

I guess this is a pretty unsolvable problem with current architectures. There's just no concrete "confidence" value. I mean, an LLM will give you a probable value for what confidence could be given the words preceding it, but that's an entirely different thing.
Renaud, almost 2 years ago
Does anyone know if there is a "Wikipedia-only" LLM, or some way to constrain the knowledge of an LLM (trained on a more limited set of sources, or constrained by limited sources of facts)?

I think an LLM interface to Wikipedia could be useful, or at least I imagine it would be.
senectus1, almost 2 years ago
It would be nice if they could show a gradient score on results indicating how *certain* it is of its answers...

It should be fairly trivial for it to tell you how often it has straight-up lied about something.
butz, almost 2 years ago
The worst part is when you start questioning how the LLM came up with an answer, and it just adds your suggestion to its answer.
isaacfung, almost 2 years ago
I don't know why so many people keep repeating the trivial fact that we can't "eliminate" hallucinations. We can't eliminate misinformation from Google, social media, or people we know either. The best we can try:

    1) Better filter the training data.
    2) Design better retrieval and reranking algorithms.
    3) When context information is provided, make the model use the sources and cite them (use extractive QA to highlight which part of the source is relevant; this is the type of hallucination we should focus on, since we can compare the generated result against the context to detect it).
    4) Make the LLM break down its reasoning into small steps that can be validated individually (CoT, PAL).

There is some research on how to manipulate the logits during decoding to make the generated text satisfy certain constraints. I suspect these techniques could be used to make the LLM stick to the provided context.

    - Controllable Text Generation with Language Constraints
    - Classifiers are Better Experts for Controllable Text Generation
    - Stay on topic with Classifier-Free Guidance
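A minimal sketch of point 3 from the list above: assemble a grounded prompt from retrieved passages and instruct the model to cite numbered sources or refuse. The call_llm stub, the example passages, and the instruction wording are assumptions, not anything from the article:

    # Sketch of point 3: answer only from numbered sources, cite them, or refuse.
    # `call_llm` is a placeholder (assumption) for whatever completion API is used.
    def build_grounded_prompt(question: str, passages: list[str]) -> str:
        sources = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
        return (
            "Answer the question using ONLY the sources below. "
            "Cite source numbers like [1]. If the sources do not contain "
            "the answer, reply exactly: I don't know.\n\n"
            f"Sources:\n{sources}\n\nQuestion: {question}\nAnswer:"
        )

    def call_llm(prompt: str) -> str:
        return "[model response goes here]"  # wire this to a real model

    passages = [
        "Extractive QA highlights the span of a source that supports an answer.",
        "Comparing a generated answer against the retrieved context can flag hallucinations.",
    ]
    print(call_llm(build_grounded_prompt("How can cited sources help detect hallucinations?", passages)))

Because the answer is tied to numbered sources, the generated text can be compared against the cited passages, which is exactly the check point 3 relies on.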
alecco, almost 2 years ago
blogspam
vouaobrasil, almost 2 years ago
I believe that LLMs should be banned, but if they have to exist, we should teach them ethics first before anything else.