科技回声 (Tech Echo)

A tech-news platform built with Next.js, mirroring global tech news and discussion.

© 2025 Tech Echo (科技回声). All rights reserved.

GPT-4 “discovered” the same sorting algorithm as AlphaDev by removing “mov S P”

190 points | by machdiamonds | almost 2 years ago

18 comments

danpalmer · almost 2 years ago

The paper shows several distinct improvements to a sorting algorithm, and presents evidence that the process is generally applicable. This tweet points GPT-4 at 20 instructions and asks if any can be removed, and it finds one optimisation.

That's good to see from GPT-4, but the comparison seems disingenuous, and I'd expect more from someone in academia.
cafaxo · almost 2 years ago

GPT-4's explanation of its optimization does not make sense to me. It writes "Instead of moving it to P, we can directly use S in the following comparisons, saving one instruction." but then proceeds to use P as if that mov had happened.

AlphaDev's optimization relies on the fact that B and C are already in the correct order. This precondition is missing from the prompt given to GPT-4. It seems that GPT-4 is hallucinating something that only resembles the correct optimization at first glance.
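To see why the precondition matters, here is a hypothetical Python model of inserting a value into an already-ordered pair using only compare/conditional-move-style operations. This is a sketch of the shape of the sort3 fragment under discussion, not the paper's exact x86 register sequence:

```python
def cmov_min(x, y):
    # models cmp followed by a conditional move: keep the smaller value
    return x if x < y else y

def cmov_max(x, y):
    # models cmp followed by a conditional move: keep the larger value
    return x if x > y else y

def insert_into_ordered_pair(a, b, c):
    """Sort (a, b, c) assuming the precondition b <= c already holds.

    Hypothetical model: the b <= c precondition is what makes the
    shorter instruction sequence valid, and that precondition was
    absent from the prompt given to GPT-4.
    """
    lo = cmov_min(a, b)
    mid = cmov_max(b, cmov_min(a, c))
    hi = cmov_max(a, c)
    return lo, mid, hi
```

When the precondition holds, `insert_into_ordered_pair(5, 1, 3)` returns the sorted triple `(1, 3, 5)`; when it does not, e.g. `insert_into_ordered_pair(2, 3, 1)`, the result is not sorted, which is exactly the failure mode described above.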
pfedak · almost 2 years ago

I posted a reply, but I think ultimately this is just a coincidence. There's a naïve reason that "mov S P" looks redundant ("just use S instead of P later"), but in typical GPT fashion, this is specious, and can't actually be done. It's essentially trying to swap two variables without using a temporary. If x86 had a conditional swap instruction, it could, but it doesn't, and just doing "cmp,cmov,cmov" can't handle it.

Another giveaway is that removing that line in the real optimization changes the output of the provided snippet if C < B. It feels like a hard sell to say GPT picked this line for that subtle reason based on information not provided, but explained it with something only correct at surface level.
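The "swap needs a temporary" point can be sketched in Python. In this hypothetical model each conditional assignment stands in for a cmov, and the temporary stands in for the extra register that the "redundant-looking" mov provides:

```python
def cond_swap_with_temp(cond, x, y):
    # Correct conditional swap: the temporary preserves x's old value,
    # playing the role of the extra mov in the assembly sequence.
    if cond:
        tmp = x
        x = y
        y = tmp
    return x, y

def cond_swap_two_cmovs(cond, x, y):
    # "Just drop the mov": two conditional moves with no temporary.
    # The first conditional move clobbers x's old value before y
    # has a chance to read it.
    if cond:
        x = y
    if cond:
        y = x  # x already holds y's value here
    return x, y
```

`cond_swap_with_temp(True, 1, 2)` gives `(2, 1)`, while `cond_swap_two_cmovs(True, 1, 2)` gives `(2, 2)`: instead of swapping, the two-cmov version silently duplicates a value.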
beisner · almost 2 years ago

Is it possible that GPT-4 already had this new optimization somewhere in its training set? The optimizing patch DeepMind published on has been floating around for several months at this point…

Edit: I found the merges for sorting (Jan 2022) [1] and hashing (Jan 2023) [2]. Both of these are very plausibly in the training set for GPT-4, which was frozen sometime in March 2023.

[1] https://reviews.llvm.org/D118029

[2] https://github.com/abseil/abseil-cpp/commit/74eee2aff683cc7dcd2dbaa69b2c654596d8024e
awegio · almost 2 years ago

DeepMind's blog post on AlphaDev says:

> AlphaDev uncovered faster algorithms by starting from scratch rather than refining existing algorithms

Finding that specific optimization, especially when given the comments, seems almost trivial by comparison.

Edit: I tried to understand the optimization in question. This is not the full sort3 algorithm; it is only valid under the assumption that B < C. In that case the GPT-4 answer is actually wrong, because it wasn't given that assumption.
imranq · almost 2 years ago

This is an impressive demonstration of GPT-4's logic and reasoning capabilities, but the DM result uses a real-world reward signal (e.g. the list is sorted correctly) to validate its results, whereas we can never be sure whether GPT-4's outputs are hallucinated or not. Also, this experiment would never have been done if not for the DM paper in the first place.
cypherpunks01 · almost 2 years ago

Wow, no reinforcement learning needed! Amazing to think, all they would've had to do instead is spend a decade or two building towards a 1-trillion-parameter transformer model and spend $100m or two training it. Then fine-tune the model using... y'know, never mind : )
bradley13 · almost 2 years ago

Trying to attribute intelligent intent where there is none. ChatGPT just puts tokens in order, following amalgamated examples it has seen.

When it screws up, we call it a hallucination. With careful prompting, you can get it to screw up in a way that works.

Color me unimpressed.
agluszak · almost 2 years ago

Can you set ChatGPT's temperature by simply asking it to "use temperature 0.0"? Doesn't sound legit.
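For context: temperature is a decoding-time sampling parameter set through the API, not something the model can change because the prompt asks for it. A minimal sketch of what the parameter does (standard temperature-scaled softmax; the logits here are made up for illustration):

```python
import math

def softmax_with_temperature(logits, temperature):
    # Divide logits by the temperature before softmax: a low temperature
    # sharpens the distribution toward the argmax (near-deterministic
    # sampling), a high temperature flattens it toward uniform.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

At `temperature=0.01` nearly all probability lands on the largest logit, which is the near-deterministic behavior people want from "temperature 0"; at `temperature=100.0` the same three logits become almost equiprobable.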
yieldcrv · almost 2 years ago

I've gotten extremely odd algorithms and solutions that saved me lots of resources.

Lots of bit-shifting and unsigned ints and approximations these days. Stuff I would never have thought of, and then I can talk through how something else only I thought of would be even *more* applicable, and it refines it for that too! Great pair programmer!

This level of competence will never show up in a live no-resource interview or any (watched) time trial I do.
bitshiftfaced · almost 2 years ago

Since they brute-forced all solution programs for this case anyway, DeepMind already admitted that *they* didn't even need to use AlphaDev in this particular case. GPT-4 being able to work it out is great, but it says nothing about the value of the AlphaZero/AlphaDev algorithm and the kinds of interesting problems you can solve with it.
summerlight · almost 2 years ago

Isn't this something like an interviewer who already knows the answer asking an interviewee a very specific question designed with a certain level of guidance in mind? It's still impressive for an LLM, but it's not even an apples-to-oranges comparison.
yb303 · almost 2 years ago

It didn't discover anything. It only minces words it found online up to 2021.
IshKebab · almost 2 years ago

Very impressive, though you are using a lot of prior knowledge to guide it towards the solution you already know. The RL version presumably wasn't given the hint that it could remove an instruction.

Still very impressive, but I'd like to see it work on some other assembly for which the "answer" isn't already known.
world2vec · almost 2 years ago

The green GPT icon in the chat would indicate it's GPT-3.5, right? GPT-4 is purple, at least for me.
globular-toast · almost 2 years ago

Isn't it cheating a bit to prompt for removing a line? Not all algorithms can be optimised by removing a single step.
dang · almost 2 years ago

Comments moved to https://news.ycombinator.com/item?id=36228125.
BulgarianIdiot · almost 2 years ago

I don't like the implications of the quoted "discovered".