It's not just statistics: GPT-4 does reason

48 points by Armic about 2 years ago

11 comments

dataangel about 2 years ago

The author could have done far simpler tests to find that GPT-4 has lots of trouble reasoning. Forget sorting, GPT-4 has trouble counting. Repeat a letter N times and ask it how many there are. It breaks before you hit 20. Or try negating a statement multiple times: more than two negations is rare in natural language, and again it will fall over.
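The two tests described above are easy to generate with a known ground truth. Here is a minimal sketch; the function names, the prompt wording, and the sample statement are all assumptions made for illustration, and sending the prompts to a model is left out.

```python
def counting_prompt(letter: str, n: int) -> tuple[str, int]:
    """Build a 'how many letters' prompt and return it with the true count."""
    prompt = f"How many characters are in the following string? {letter * n}"
    return prompt, n


def negation_prompt(depth: int, statement: str = "the sky is blue",
                    statement_is_true: bool = True) -> tuple[str, bool]:
    """Wrap a statement in `depth` negations and return the expected truth value."""
    prefix = "it is not the case that " * depth
    prompt = f"True or false: {prefix}{statement}."
    # Every negation flips the truth value, so only the parity of depth matters.
    expected = statement_is_true if depth % 2 == 0 else not statement_is_true
    return prompt, expected


if __name__ == "__main__":
    for n in (5, 15, 25):
        print(counting_prompt("a", n))
    for depth in (1, 2, 3, 4):
        print(negation_prompt(depth))
```

Comparing a model's answers against the returned ground truth for increasing `n` and `depth` would show where, if anywhere, it starts to fail.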

eximius about 2 years ago

Part 1: what are n-grams

Part 2: it's using embeddings (but in a lot of words, without actually saying it)

Part 3: sufficiently trained NNs can sort things, which isn't statistics

-----

I actually found some of the article interesting, but not terribly convincing. Even though I consider these LLMs to be stochastic parrots, that isn't to say they haven't learned something during training, at least according to the colloquial meaning we typically ascribe to even simpler models like MNIST classifiers. I'm even kind of okay with saying that it reasons about things in the same colloquial sense.

In a lot of ways, we just don't have a good definition of what 'reasoning' is. Is it just bad at reasoning because its input/output/modeling/training is insufficient? Humans struggle to learn multiplication tables when we're young. Are those humans not reasoning because they get the math wrong?

But there isn't plasticity, there isn't adaptability, and it's unclear to me that you can effectively inform it how to embed truly novel information - surely something that is possible, with some neurons existing for routing and activating other learned embeddings.

Anyway, interesting stuff.

kelseyfrog about 2 years ago

It's ontologically impossible. Models bleach reason.

Despite reason being a metaphysical property of the training data, the process of optimisation means the weights are metaphysically reasonless. Therefore any output, as it is a product of the weights, is also reasonless.

This is exactly the opposite of copyright as described in the "What Colour are your bits?" essay: <a href="https://ansuz.sooke.bc.ca/entry/23" rel="nofollow">https://ansuz.sooke.bc.ca/entry/23</a>

jxf about 2 years ago

The author's claim is "this isn't just statistics; the model is reasoning". But just because something goes beyond "just statistics" doesn't mean it's reasoning.

williamcotton about 2 years ago

Both sides of this argument are pointless. The questions to ask are: is it useful, and how?

"For philosophical problems arise when language goes on holiday." - Ludwig Wittgenstein

joppy about 2 years ago

The compare-how-big-a-lookup-table-is argument is a bit of a red herring for comparing how complex things are. For example, a 3x3 matrix implements a map from three floats to another three floats, which is a huge space of possibilities (if we have 4-byte floats, this function space has (2^96)^(2^96) elements). From this perspective, representing that map as 9 numbers is an amazing compression ratio. But surely one cannot argue that matrices “have more going on” than arbitrary functions.
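The arithmetic behind that comparison can be written out as a small sketch. It assumes, as the comment does, 4-byte floats treated as raw 32-bit patterns; the helper names are made up for illustration.

```python
FLOAT_BITS = 32  # 4-byte floats, treated as raw bit patterns


def lookup_table_bits(n_inputs: int, n_outputs: int, bits: int = FLOAT_BITS) -> int:
    """Bits needed to tabulate an arbitrary function: one output per possible input.

    This equals log2 of the number of such functions, since
    (2^out_bits)^(2^in_bits) = 2^(out_bits * 2^in_bits).
    """
    return (n_outputs * bits) * 2 ** (n_inputs * bits)


def matrix_bits(rows: int, cols: int, bits: int = FLOAT_BITS) -> int:
    """Bits needed to store a dense rows x cols matrix of floats."""
    return rows * cols * bits


table = lookup_table_bits(3, 3)   # 96 * 2^96 bits for an arbitrary map
matrix = matrix_bits(3, 3)        # 288 bits for the 3x3 matrix

if __name__ == "__main__":
    print(f"arbitrary function: {table} bits")
    print(f"3x3 matrix:         {matrix} bits")
```

The gap between the two numbers is the "compression ratio" in question, and it says nothing by itself about which object is more complex.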

Buttons840 about 2 years ago

One of my favorite GPT-4 moments shows good understanding on its part.

I was talking to GPT-4 about the Adam optimization algorithm and it was teaching me how it works (this sentence was surreal to type). At one point we were talking about a mathematical term of the form [ A * (B / C) ]. I was casually fishing for it to make a mistake and I said "I see, and the A term can be moved to the denominator, right?" GPT-4 replied "yes" and then gave me [ B / (C / A) ] -- I guess A can go in the denominator after all. :)
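For anyone who wants to double-check the rewrite, here is a quick sketch using exact rational arithmetic. The function names are illustrative only, and the identity needs A and C to be nonzero.

```python
from fractions import Fraction
import random


def original_form(a: Fraction, b: Fraction, c: Fraction) -> Fraction:
    """A * (B / C)"""
    return a * (b / c)


def rewritten_form(a: Fraction, b: Fraction, c: Fraction) -> Fraction:
    """B / (C / A), with A moved into the denominator's denominator."""
    return b / (c / a)


def random_nonzero(rng: random.Random) -> Fraction:
    """A small nonzero rational, so no division by zero can occur."""
    numerator = rng.choice([n for n in range(-9, 10) if n != 0])
    return Fraction(numerator, rng.randint(1, 9))


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(1000):
        a, b, c = (random_nonzero(rng) for _ in range(3))
        assert original_form(a, b, c) == rewritten_form(a, b, c)
    print("A * (B / C) == B / (C / A) on all sampled nonzero values")
```

Using `Fraction` rather than floats avoids rounding noise, so equality here is exact.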

Imnimo about 2 years ago

> Does the n-gram model really need all those parameters to mimic GPT-4? Yes, it does.

I don't understand what this argument is supposed to demonstrate. Obviously you can compress the 8000-gram model that GPT-4 represents - GPT-4's weights are proof!
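To put a rough number on the uncompressed side, here is a sketch of how many entries a literal 8000-gram table could have. The vocabulary size of 100,000 tokens is an assumption for illustration and is not stated in the thread; only n = 8000 comes from the comment.

```python
import math


def log10_ngram_entries(vocab_size: int, n: int) -> float:
    """log10 of vocab_size**n, the number of distinct n-token sequences."""
    return n * math.log10(vocab_size)


ASSUMED_VOCAB_SIZE = 100_000  # assumption, not taken from the thread
N = 8_000                     # the "8000-gram" from the comment

if __name__ == "__main__":
    exponent = log10_ngram_entries(ASSUMED_VOCAB_SIZE, N)
    print(f"about 10^{exponent:.0f} possible table entries")
```

Any set of weights that fits on real hardware is smaller than that by an enormous factor, which is the sense in which the weights are a compressed representation.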

travisjungroth about 2 years ago

I really like the tests in the article. So many claims about limitations of LLMs sound like claims of capability (“it can’t reason”), but when pressed, people retreat to definitional arguments (“because only people can do that”).

Even when you get into testable capability, there’s still some ambiguity. I think of a capability as having levels: never, explained by chance, not explained by chance, good enough for what’s needed, always. Arguments often get stuck because people are talking about different levels. Maybe it can solve logic puzzles better than chance, but not well enough for your purposes. It doesn’t make sense to round that off to zero.

hexorg about 2 years ago

I generally like the approach, but you can implement sorting of a fixed number of items in a Boolean circuit (combinational logic, a bunch of ANDs and ORs strung together). There is no need for recurrence or memory at all to sort a fixed number of items. There are a handful more abstractions possible in Boolean logic than in n-gram statistics, and there are a lot of Boolean circuits that can be very useful. But I’d like to see neural networks be able to figure out algorithms that require at least a finite state machine (FSM), but preferably something harder. The challenge is to find a problem where the solution size differs between an FSM and combinational logic.
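As a concrete instance of that point, here is a minimal sketch of a fixed sorting network for four bits built only from AND and OR. The comparator layout is the standard 5-comparator network for 4 inputs; the function name is made up for illustration. The comparator sequence is fixed in advance and never depends on the data, so the loop is just a compact way of writing out a feed-forward circuit.

```python
from itertools import product

# Fixed wiring: each pair (i, j) is one compare-exchange gate with i < j.
NETWORK_4 = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]


def sort4_bits(bits):
    """Sort four 0/1 values in ascending order using only AND and OR.

    For bits, min(x, y) is x AND y and max(x, y) is x OR y, so every
    compare-exchange is two gates. No state is carried between calls.
    """
    wires = list(bits)
    for i, j in NETWORK_4:
        wires[i], wires[j] = wires[i] & wires[j], wires[i] | wires[j]
    return wires


if __name__ == "__main__":
    # Exhaustively check all 16 inputs.
    for bits in product([0, 1], repeat=4):
        assert sort4_bits(bits) == sorted(bits)
    print("all 16 inputs sorted correctly")
```

By the zero-one principle, a comparator network that sorts every 0/1 input also sorts arbitrary values when the gates are replaced with min and max.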

shagie about 2 years ago

The Cabbage, Goat, and Wolf problem intrigued me the other day too... so I did a fantasy world. There aren't incompatible pairs, but there are limitations on what can be done.

<pre><code>
Imagine a universe where there are three types of people: wizards, warriors, and priests.
Wizards can open a portal that allows two people to go through at a time, but they cannot go through the portal themselves.
Priests can summon people from other locations to their location or teleport to the location of another person.
Warriors cannot teleport or summon, but may be teleported or summoned by others.
---
Given four wizards, a priest, and a warrior - what are the necessary steps to move them all to a new location?
</code></pre>

To which GPT-4 replied:

<pre><code>
In order to move everyone to a new location, we'll need to use both the powers of wizards and the priest. Let's denote the wizards as W1, W2, W3, W4, the priest as P, and the warrior as Wr. We also need to define two locations: the starting location (Location A) and the destination (Location B).

Here is a possible sequence of steps:

1. W1 opens a portal from Location A to Location B. W2 and W3 go through this portal. (Location A: W1, W4, P, Wr; Location B: W2, W3)
2. W1 opens another portal from Location A to Location B. This time W4 and Wr go through. (Location A: W1, P; Location B: W2, W3, W4, Wr)
3. Now the priest (P) uses his summoning ability to bring W1 to Location B. (Location A: ; Location B: W1, W2, W3, W4, Wr)
4. Finally, the priest (P) uses his teleportation power to move himself to Location B. (Location A: ; Location B: W1, W2, W3, W4, P, Wr)

Everyone is now at the new location, Location B.
</code></pre>

I found that rather interesting - especially the shorthand for being able to write about it.

I then revised it (because I didn't like the second step of the solution):

<pre><code>
Redo this exercise with the limitation that a wizard may only open one portal.
</code></pre>

And then it went on with:

<pre><code>
The limitation that a wizard can only open one portal makes this a bit more tricky. But, it's still possible to move everyone using the powers of both wizards and the priest. Let's again denote the wizards as W1, W2, W3, W4, the priest as P, and the warrior as Wr.

Here is a possible sequence of steps:
...
</code></pre>

The thing I like about this is that the pattern of this problem doesn't even exist anywhere else. (I also did another one with Wizards, Warriors, and Witches to see how it would tweak things if all the roles started with a 'W', and then changed the witch's power to be a swap instead of a summon.)

I personally found this more convincing than the goat, cabbage, wolf.

This was partly inspired by the old school SHRDLU world, where shapes and colors and things stacked on top of each other need to be manipulated. In doing that example, the world of SHRDLU is described in natural language. And while SHRDLU was impressive in itself (and over 50 years old), it was programmed to manipulate that world, and the program wasn't useful for abstracting to other worlds. GPT-4 was impressive at being able to manipulate the world without needing to be programmed for that world (and you could then expand it to other things that it is less familiar with).

<pre><code>
Imagine a world that is made up of spheres, cones, bowls. An object has a size and a color.
There is a small red sphere in a medium blue bowl.
There is a large red cone.
There is a small green bowl.
There is a medium blue sphere in a large red bowl.
---
What are the steps necessary to move the red sphere into the red bowl?
</code></pre>

To which it replied:

<pre><code>
...
It's important to note that these steps assume you're allowed to place the red sphere in the red bowl even though there's already a medium blue sphere there. If that's not the case and you're required to empty the bowl first, then the steps would be a little different.
</code></pre>

And I then asked it:

<pre><code>
What steps are necessary if the bowl must be empty first?
</code></pre>

And got back a response that included:

<pre><code>
Temporarily place the small red sphere in a safe location, for instance, inside the small green bowl, making sure it won't roll away or get damaged.
</code></pre>

Again, I find this more impressive than a reformulation of a well-known problem, as there's also some implied understanding of the world in it (spheres can roll away unless put in a bowl).
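Plans for the wizard/priest/warrior world can also be checked mechanically. The following is a minimal sketch under one possible reading of the rules: a portal moves at most two people who stand where its opener stands, a summon brings the target to the priest's current location, and a teleport moves the priest to the target's location. The step encoding and all names are assumptions for illustration. Under this strict reading the priest has to teleport before summoning W1, so the sample plan orders those two steps that way.

```python
ROLES = {"W1": "wizard", "W2": "wizard", "W3": "wizard", "W4": "wizard",
         "P": "priest", "Wr": "warrior"}


def require(condition: bool, message: str) -> None:
    """Raise ValueError when a step breaks the (assumed) rules."""
    if not condition:
        raise ValueError(message)


def run_plan(plan, start: str = "A") -> dict:
    """Apply each step in order and return the final location of every person."""
    where = {name: start for name in ROLES}
    for step in plan:
        kind = step[0]
        if kind == "portal":
            _, opener, travelers, destination = step
            require(ROLES[opener] == "wizard", "only wizards open portals")
            require(opener not in travelers, "a wizard cannot use their own portal")
            require(len(travelers) <= 2, "a portal carries at most two people")
            require(all(where[t] == where[opener] for t in travelers),
                    "travelers must be at the portal's origin")
            for traveler in travelers:
                where[traveler] = destination
        elif kind == "teleport":
            _, priest, target = step
            require(ROLES[priest] == "priest", "only priests teleport")
            where[priest] = where[target]
        elif kind == "summon":
            _, priest, target = step
            require(ROLES[priest] == "priest", "only priests summon")
            where[target] = where[priest]
        else:
            raise ValueError(f"unknown step type: {kind}")
    return where


PLAN = [
    ("portal", "W1", ["W2", "W3"], "B"),
    ("portal", "W1", ["W4", "Wr"], "B"),
    ("teleport", "P", "W2"),   # priest joins the group at B first
    ("summon", "P", "W1"),     # then pulls the last wizard over
]

if __name__ == "__main__":
    print(run_plan(PLAN))
```

Adding a per-wizard portal counter to `run_plan` would be one way to encode the "a wizard may only open one portal" variant.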
