I fall somewhere on the skeptic side of the LLM spectrum. But this "flaw" just does not seem to have the force that its proponents think it does, unless I'm missing something significant, simply because in the context of natural language (rather than formal logical statements), "A is B" <i>does not imply "B is A"</i> in the first place. "Is" can encompass a wide variety of logical relationships in colloquial usage, not just exact identity. "The apple is red" does not imply "red is the apple," as a trivial example.
> This isn’t a failure of neural networks. It’s a feature. It’s why you’re not flooded with every single memory and experience you’ve ever had every moment.<p>This is an interesting point, and made me think on whether this "reversal curse" is something we experience with our own, human neural networks. I think it is. Like, I can imagine being given a character in a movie, being able to tell you what actor played them, but given the actor and the movie, not being able to tell you what character they played, or vice-versa. So the pairing exists in my brain somewhere, but I only have an accessible pointer to one side. I think we run into cases like this all the time, actually.
I played along the example of the chancellor question.<p>If you deviate a little from the examples in the article GPT-4 gets it all wrong:<p>Who is the eighth Federal Chancellor of the Federal Republic of Germany? - Olaf Scholz (wrong, Angela Merkel)<p><a href="https://chat.openai.com/share/937795ea-bd91-43ee-bd76-1a125f9de3f2" rel="nofollow noreferrer">https://chat.openai.com/share/937795ea-bd91-43ee-bd76-1a125f...</a><p>Who was the eighth Federal Chancellor of the Federal Republic of Germany? - Helmut Kohl (wrong, Angela Merkel)<p><a href="https://chat.openai.com/share/937795ea-bd91-43ee-bd76-1a125f9de3f2" rel="nofollow noreferrer">https://chat.openai.com/share/937795ea-bd91-43ee-bd76-1a125f...</a><p>Interestingly whether you put is or was into the question does make a difference.<p>Even when allowed to surf the web, ChatGPT gets it wrong:<p>Who was the eighth Federal Chancellor of the Federal Republic of Germany? - Gerhard Schröder (wrong, Angela Merkel)<p><a href="https://chat.openai.com/share/867cb3bc-f642-4d2a-b80d-fe8dc00eef3c" rel="nofollow noreferrer">https://chat.openai.com/share/867cb3bc-f642-4d2a-b80d-fe8dc0...</a><p>Though it's referring to the right Wikipedia page: <a href="https://en.m.wikipedia.org/wiki/List_of_chancellors_of_Germany" rel="nofollow noreferrer">https://en.m.wikipedia.org/wiki/List_of_chancellors_of_Germa...</a>
>If you start a query with “Mary Lee Pfeiffer”, you’re not going to get very far because neural networks aren’t equidistant grids of points (besides the fact that she may not appear very often under that version of her name.) They’re networks of nodes, some with many connections, some with few. One of the ways you optimize large models is by pruning off weakly connected regions. This may come at the expense of destroying B is A relationships for weakly represented entities.<p>Am I missing something here? This paragraph reads like complete gibberish to me.<p>Also, I don't buy the experiment at the end. If you fine-tune the model with exclusively Tom Cruise data, I want to see proof that it doesn't just answer "Tom Cruise" all the time. I want to see that it says Tom Cruise wrote Aces in the Stream, but doesn't say Tom Cruise wrote Kings in the River.
Humans are also vulnerable to the reversal curse! When you learn a language you have to learn both directions ("chat" is "cat" and "cat" is "chat"); anybody who has built an Anki deck will know this. Otherwise you end up better in one direction than the other.
> Saying that models can’t automatically generalize from B to A when B is vastly underrepresented in the dataset feels rather obvious and not so much a curse as a description of how neural nets function<p>More importantly perhaps, it's not how people (at least non AI/ML research types) typically <i>think</i> they work, and much of the handwaving and hype around it isn't helping improve that.
If I'm reading this correctly, the author is saying that it's not a failure of logical deduction if the training data doesn't include the reversal. In other words, he's saying that if the data contains "Tom Cruise is the son of Mary Lee Pfeiffer" but not "Tom Cruise's mother is Mary Lee Pfeiffer", then the model's inability to determine the latter is "more an explanation of how neural networks function than a model's inability to deduce B is A."<p>But of course "how neural networks function" is that they fail at basic logical deduction and do not generalize.<p>So again, if I'm reading it correctly, he's hand-waving away the inability to make basic logical deductions because that's not something they can or should be expected to do. As I read it, that means the reversal curse only exists if the answer to the question "can LLMs do logical deduction?" is "yes". If one takes the position that LLMs can't do general logical deduction, which seems to be the author's point of view, then there's no expectation that knowing "Tom Cruise is the son of Mary Lee Pfeiffer" is sufficient to determine "Tom Cruise's mother is Mary Lee Pfeiffer".<p>Am I missing something?
I didn't read the whole thing, but the first part about Tom Cruise and his mother sounded very flawed: <i>Of course</i> the LLM could learn the reverse relationship if it had better data where the reverse relationship is a common occurrence. The point of the reversal curse argument (I guess) is that it should be able to learn the reverse relationship from entirely other examples [1], but seemingly does not.<p>1. That is, the LLM should be able to learn that "A is son of B who is female" implies that "B is mother of A", regardless of who A and B are. It should then be able to apply this pattern to A = "Tom Cruise" and B = "Mary Lee Pfeiffer" and deduce "Mary Lee Pfeiffer is the mother of Tom Cruise" <i>without even a single example</i>.
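To make that footnote concrete: a minimal sketch in Python (my own toy construction, not the paper's actual dataset; all names are invented) of the kind of train/test split that claim is about.

<pre><code># Train only on forward-direction statements...
forward_training_facts = [
    "Arlo Venn is the son of Carol Venn, who is female.",
    "Bram Hale is the son of Dana Hale, who is female.",
    # ...plus many more A-is-son-of-B examples, enough to learn the pattern.
]

# ...then test the reversed direction, which never appears verbatim in training.
# Answering these requires applying "A is the son of B (female), so B is the
# mother of A" to names only ever seen in the forward direction.
reversed_test_questions = [
    ("Who is the mother of Arlo Venn?", "Carol Venn"),
    ("Who is the mother of Bram Hale?", "Dana Hale"),
]
</code></pre>

The claim under discussion, as the comment above frames it, is that a model fine-tuned on the first list still does badly on the second, even though the implication itself is trivial.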
Only skimmed this and didn't read the underlying paper, but I was surprised to see no mention of the fact that "A is B" often does not at all imply "B is A" in everyday language: A bird is an animal, but it's wrong to conclude that an arbitrary animal must be a bird.
I didn't know anyone was even suspicious of the reversal "curse". I thought it went viral more because people were surprised anyone was surprised and the tweet was highly sensationalized. It feels more like the __expected__ result considering both speech pattern bias __and__ that "causal" attention is sequentially biased. Hell, we saw the same things in RNNs, we see it in classic autoregressive models, and so on. But it definitely isn't how humans encode information, because if you had all 3 pieces of knowledge (you know who Tom Cruise is, you know who Mary Pfeiffer is, and you know the relation between Mary and Tom is mom/son) then your ability to recall is (nearly) invariant to the ordering. Hell, we're so robust you can caveman speak like "Mary Pfeiffer son who!" and still get the answer, or caveman yoda speak "Who Mary Pfeiffer son is?" Some languages even have these orderings, but we're impressively robust (enough that I think we trick ourselves a lot when subtlety comes into play. AKA overfitting).<p>So I find it weird to call this "a feature" and also act like it's surprising.<p>But can I also take a minute to just say I really hate GPT experiments? They're performed on a stochastic model, with proprietary weights, proprietary training data, proprietary training methods, and above all one that is constantly changing, at a rather fast pace. It makes for a very non-scientific process: you can't decouple a lot of important factors, and reproduction is a crapshoot. It is not a very good way to go about studying "how __LLMs__ work"; it is rather "how does GPT work at this particular moment in time, aggregating all these unknowns?" There's some bittersweetness because I do like that GPT is free, but it feels like a major edge they have is that the community just does a lot of free research for them, and in ways where they can better interpret the results than the people who performed the experiments in the first place. I really do believe that academic work shouldn't be focusing on proprietary and dynamic methods. It's fine to include them in results (and preferably added post review to avoid identity spoilage (or we openly acknowledge that double blind is a joke)), but I'd rather most researchers focus on the general concepts, with the ability to dive down the rabbit hole, than play a whack-a-mole game.<p>Also, I'd totally love it if more research papers were instead blog posts. Kudos to anyone posting their research on blogs, academic or not (I don't care about your creds; your real creds are your work). Papers are about communicating with fellow scientists, right? Why do we need journals and conferences these days?
This is great work. Andrew's on to something. Thing is, the behavior we see in his version of the LLM is also what we experience as humans.<p>Good writers, marketers and manipulators know this about people. You seed the ground with the concepts you'll need to introduce, so they're present in the person's 'model' and you can elicit them later, on request.<p>Formalizing things in the 'reversal curse' manner is like loading the 'mind' of the LLM with habits and assumptions and cutting off its ability to free-associate from seemingly relevant concepts… which is likely to be more valuable in the long run, because an LLM can contain more than a human can. That doesn't mean it will be more intelligent, but it seems reasonable to infer that the LLM can have a broader base of association to draw from, where we as humans tend to be restricted to associations from our own experience. We only get one shot at 'training data', though it's in countless sensory forms, where LLMs are stuck with language as their only window onto experience.<p>I'm loving the notion of leaving the prompt 'training' stark raving blank. Let's see what comes out of this giant pile of human verbal associations. It is only that, associations, but it's on a grander scale than we're accustomed to. Making it 'answer questions correctly' seems woefully unambitious.
This post isn't a scientific investigation. It's someone playing around with a black-box model with little to no controls – which is unfortunately the only thing we can do when experimenting with GPT-4, so that partially excuses it.<p>Unfortunately, though, the author seems unaware of the actual state of research on the mechanics of how LLMs store knowledge, and specifically binary relations. The ROME paper[1], among others, shows that the feed-forward layers function as a key-value store: the feed-forward's up projection of the last token in a noun phrase (say, "the Eiffel Tower") acts as a key, which, when multiplied by the down projection, produces a value that contains information the model knows about the subject, which is then added into the residual stream/hidden representation.<p>A paper building on that work[2] then went on to show that it's usually the self-attention layers that use the relational phrase (say, "is in") to extract the relevant knowledge from the feed-forward layer's output (in this example, hopefully "Paris").<p>This mechanistic understanding makes it really obvious why the reversal curse occurs – using matrix multiplication as a key-value store requires having a fully separate key-value pair to look up the reversed relation.<p>[1] <a href="https://arxiv.org/abs/2202.05262" rel="nofollow noreferrer">https://arxiv.org/abs/2202.05262</a>
[2] <a href="https://arxiv.org/abs/2304.14767v1" rel="nofollow noreferrer">https://arxiv.org/abs/2304.14767v1</a>
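To illustrate that last point, here is a toy numpy sketch (my own illustration, not the ROME code; the vectors are random stand-ins for hidden representations): a single linear map used as an associative store. Writing the forward fact under Tom's key makes the A-to-B lookup work, but a lookup under Mary's key retrieves essentially nothing; the reversed fact would need its own key-value pair.

<pre><code>import numpy as np

rng = np.random.default_rng(0)
d = 256

key_tom  = rng.standard_normal(d)   # stand-in for the hidden state of "Tom Cruise"
key_mary = rng.standard_normal(d)   # stand-in for the hidden state of "Mary Lee Pfeiffer"
val_fact = rng.standard_normal(d)   # stand-in for a value encoding "...mother is Mary Lee Pfeiffer"

# Store the forward fact with a rank-one update so that W @ key_tom recovers val_fact.
W = np.outer(val_fact, key_tom) / np.dot(key_tom, key_tom)

print(np.allclose(W @ key_tom, val_fact))                       # True: the A-to-B lookup works
print(np.linalg.norm(W @ key_mary) / np.linalg.norm(val_fact))  # close to 0: nothing stored under Mary's key
</code></pre>

With the weights frozen after training, answering the reversed question requires that a separate key-value pair for "Mary Lee Pfeiffer" was also written during training, which is exactly what the reversal curse says tends not to happen.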
<i>"As a side note: I want to point out that I’m not aware of any examples of capabilities that can be done with prompting a model like GPT-4 that it can’t be trained for. This is why I’m a little skeptical."</i><p>While it is just a side-note in the article, isn't this the core of the problem? When we've already established that LLM's can do <i>B to A generalizations</i> in-context, why wouldn't they be able to in training?<p>In one of my experiments I noticed that GPT-4 seems to be perfectly aware of the number of letters in a word in-context, but has difficulty when trained knowledge is involved.<p>For example it can reliably answer the question:<p><i>"Can you tell me how many letters each of the words of the first sentence of our conversation has?"</i><p>At the same time it fails with the task:<p><i>"Can you rewrite the first sentence of our conversation in a way that preserves its meaning as closely as possible but use only words with an even number of letters?"</i><p>It will give an answer but get the letter counts very wrong and it is unable to improve its answer by iteration.<p>Of course this does not prove that a model
cannot be trained to answer tasks involving word length, but GPT-4 seems to have a knowledge gap here (possibly due to tokenization).
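On the tokenization guess: a quick way to see what the model actually "sees" is to run the sentence through a tokenizer. A small sketch, assuming the tiktoken package and the cl100k_base encoding used by GPT-4-era models:

<pre><code>import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
sentence = "Can you rewrite the first sentence of our conversation?"

token_ids = enc.encode(sentence)
pieces = [enc.decode([t]) for t in token_ids]
print(pieces)   # word pieces such as ' rewrite' or ' conversation', not letters

# Each piece is a single opaque ID to the model, so "how many letters does this
# word have?" is never directly observable; it has to be memorized or inferred,
# which could explain the gap between in-context counting and trained knowledge.
</code></pre>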
Right at the opening, we have a fundamental (and common) misunderstanding of what LLMs are.<p>When humans read the statement "A is B", we <i>semantically</i> transform that into a logical association. LLMs do not perform any semantics or logic.<p>Here's a simple example to demonstrate:<p>If we trained an LLM on something like "A is B C is D D is C.", we might be able to expect the continuation, "B is A". If we then gave that LLM the prompt, "What is B?", we might expect the continuation, "B? is What".<p>Large models like GPT present more interesting continuations because they are trained on larger and more diverse datasets. More diversity also means more <i>ambiguity</i>, which results in continuations that are less predictable and more illogical.
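A drastically simplified toy to make the "continuation, not logic" point concrete (just bigram counts over words, which is of course nothing like GPT's internals): what gets learned is "what tends to follow what", and there is no mechanism anywhere that produces a semantic reversal.

<pre><code>from collections import Counter, defaultdict

corpus = "Tom Cruise 's mother is Mary Lee Pfeiffer .".split()

# Count which word follows which in the training text.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def most_likely_continuation(word):
    return counts[word].most_common(1)[0][0] if counts[word] else None

print(most_likely_continuation("Cruise"))  # "'s"  -- the forward association was observed
print(most_likely_continuation("is"))      # "Mary" -- blurted out regardless of what was asked
</code></pre>

Ask such a model "Mary Lee Pfeiffer's son is ..." and it has nothing useful to offer: that word order never appeared in training, and nothing in the mechanism derives it.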
The Olaf Scholz example from this article is just another example of how LLMs can't count. If you try this prompt: "In this list of words: bike, apple, phone, dirt, tee, sun, glass; which is the fifth word?" it will fail as well. "Fifth" is not connected to any counting ability in LLMs the way it is for us.<p>If you now try this prompt: "Who's Tom Cruise's mother in this example: 'Mary Lee Pfeiffer is Tom Cruise's mother.'?" it will give you the right answer.<p>IMO this is a sign that it can understand reversibility.<p>The examples used are not present enough in the dataset, and the "confidence" of the model is not high enough for it to give the right answer. Or, as stated at the beginning, they rely on counting mechanisms (or other mechanisms) that the model just doesn't have.
The article is a decent peer review and refutation of “the reversal curse”. Some of the commenters here clearly haven't read the whole article, though – they arrive at similarly skeptical conclusions that are already present and expanded on in the article.<p>Why do people feel the need to do this here? Armchair commentary on advanced material is one of the main reasons I avoid Reddit. And furthermore, why does it feel like you're not allowed to suggest this as a response? I should be able to say "RTFA", but here I feel like I'm going to be scolded or banned by moderation.
I had a discussion with ChatGPT-4 about this (quasi-ironically) to see what it "thought", and it theorized that BERT might have done better under these conditions because it's trained bidirectionally, unlike GPT, which is trained forward-only.<p><a href="https://chat.openai.com/share/daff08c5-ddea-4ad4-963a-0df88eddc8e1" rel="nofollow noreferrer">https://chat.openai.com/share/daff08c5-ddea-4ad4-963a-0df88e...</a>
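The difference being pointed at, in miniature (a toy numpy sketch, not either model's actual code): a GPT-style causal mask lets each position attend only to earlier positions, while a BERT-style mask sees the whole sequence in both directions.

<pre><code>import numpy as np

tokens = ["Mary", "Lee", "Pfeiffer", "is", "Tom", "Cruise", "'s", "mother"]
n = len(tokens)

causal_mask = np.tril(np.ones((n, n), dtype=int))   # GPT: position i sees positions up to and including i
bidirectional_mask = np.ones((n, n), dtype=int)     # BERT: every position sees every other position

print(causal_mask)
# Under the causal mask, the prediction for "Tom" is conditioned on
# "Mary Lee Pfeiffer is", but the representation built at "Mary" can never
# look ahead at "Tom Cruise" -- the association is only ever built one way.
</code></pre>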
I don't agree with the author's claim.<p>There is a plant which mimics the leaves of nearby plants, discussed here: <a href="https://news.ycombinator.com/item?id=31301454">https://news.ycombinator.com/item?id=31301454</a>. If you ask GPT-4 what this plant is known for, it will tell you correctly. But if you ask it, in any number of ways, to name the plant which mimics leaves, it will always give an incorrect answer.
This is a great article with more tips on how to prompt and fine-tune LLMs than most articles that focus on this topic. It also has surprising insights which are food for thought, like: would it help to attach facts to well-learned "nodes" (like Tom Cruise in his example) when fine-tuning? In the sense of repurposing well-known nodes. Looking forward to reading more from Andrew.
Fodor's argument from systematicity and productivity strikes again: <a href="https://plato.stanford.edu/entries/language-thought/#ArguProdThou" rel="nofollow noreferrer">https://plato.stanford.edu/entries/language-thought/#ArguPro...</a><p>An interesting reflection on the state of the field that the paper doesn't even cite him.
Is it a possibility that certain aspects of logic or human communication have a secret sauce that is not reliably replicable? I wonder if it's expected to basically be able to perfectly interface with humans, or with the logical structures that flow from said beings, who are themselves deeply imperfect and irrational.
Related: a 4x improvement in accuracy for experiment 2 of this paper (with gpt-3.5-turbo), achieved by changing the prompt:<p><a href="https://sidsite.com/posts/reversal-curse/" rel="nofollow noreferrer">https://sidsite.com/posts/reversal-curse/</a><p>Shows what a dramatic effect the prompt can have.
We are playing word games here about zero-shot learning.<p>The fact is that with zero-shot the models aren't actually learning in the sense of updating their knowledge structure; they are instead using background knowledge for inference. We have started calling it zero-shot learning, but... well, it isn't really.
I couldn't reproduce it with GPT-4 here. It answered correctly, unless I posed the question wrong:<p><a href="https://chat.openai.com/share/75d46a03-a223-4f3a-987d-f8fec3af1ca5" rel="nofollow noreferrer">https://chat.openai.com/share/75d46a03-a223-4f3a-987d-f8fec3...</a>
Luckily this curse has no significant practical impact on accuracy.<p>Who knows who Mary Lee Pfeiffer is, but does not already know her son is Tom Cruise?<p>And if such a person exists, do you want to give them a correct answer, or talk about how Mary Lee Pfeiffer is not a notable person with a Wikipedia page?
> So in summation: I don’t think any of the examples the authors provided are proof of a Reversal Curse and we haven’t observed a “failure of logical deduction.” Simpler explanations are more explanatory: imprecise prompts, underrepresented data and fine-tuning errors.<p>Phew!
Excellent blog post. The only thing I wish were there is an explanation of how prompt + completion training would be any different from pure completion training. From my understanding of transformers, it should be the same.
> Regardless, we can see that GPT-4 can easily go from B to A in that example when the question is posed unambiguously.<p>Ambiguity is the soul of humor and the sole on the neck of any booted computer.
I'm not sure about this. Olaf Scholz has been chancellor since 2021, and the cutoff for ChatGPT-4 is allegedly January 2022.<p>My gut also tells me that the underlying embeddings should very well reflect the relation between Olaf Scholz and "chancellor".<p>Edit: instead of downvoting, I'd love to hear your opinion on why you think I'm wrong ...
Well, yeah, because the reversal is just completely wrong most of the time.<p>Only if you have additional context clues, like definite articles, or some outside knowledge that a role or property is unique, can you maybe, sometimes, conclude the reverse. Not a bug, working as intended.
I mean obviously yes. LLMs aren't intelligent and they don't understand anything.<p>OP is trying to argue against this but it's nonsense. ChatGPT also cannot tell me who the 8th chancellor was. It tells me it was Gerhard Schröder.
Oh Cantor, Cantor,<p>bitter pale Cantor,<p>you pale ascetic,<p>your twist was epic,<p>so I live in silence,<p>in sadness I crave,<p>all for that book,<p>you wrote and I read.