Show HN: Recursive LLM Prompts

97 points by andyk, about 2 years ago

I've been playing with the idea of an LLM prompt that causes the model to generate and return a new prompt: https://github.com/andyk/recursive_llm

The idea I'm starting with is to implement recursion using English as the programming language and GPT as the runtime.

It's kind of like traditional recursion in code, but instead of having a function that calls itself with a different set of arguments, there is a prompt that returns itself with specific parts updated to reflect the new arguments.

Here is a prompt for infinitely generating Fibonacci numbers:

> You are a recursive function. Instead of being written in a programming language, you are written in English. You have variables FIB_INDEX = 2, MINUS_TWO = 0, MINUS_ONE = 1, CURR_VALUE = 1. Output this paragraph but with updated variables to compute the next step of the Fibonacci sequence.

Interestingly, I found that to get a base case to work I had to add quite a bit more text (i.e. the prompt I arrived at is more than twice as long: https://raw.githubusercontent.com/andyk/recursive_llm/main/prompt_fibonnaci_include_math.txt).
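For comparison, here is a minimal Python sketch of the update the prompt asks the model to perform, which can serve as a deterministic reference when checking model output. The variable semantics are an assumption read off the prompt (MINUS_TWO and MINUS_ONE hold the two values whose sum is CURR_VALUE), and the `step` helper and regex are hypothetical names, not part of the linked repo.

```python
import re

# The prompt text from the post, used here as plain data.
PROMPT = (
    "You are a recursive function. Instead of being written in a programming "
    "language, you are written in English. You have variables FIB_INDEX = 2, "
    "MINUS_TWO = 0, MINUS_ONE = 1, CURR_VALUE = 1. Output this paragraph but "
    "with updated variables to compute the next step of the Fibonacci sequence."
)

# Matches each "NAME = <digits>" assignment embedded in the prompt.
VAR_RE = re.compile(r"(FIB_INDEX|MINUS_TWO|MINUS_ONE|CURR_VALUE) = (\d+)")


def step(prompt: str) -> str:
    """Return the prompt with its variables advanced by one Fibonacci step."""
    state = {name: int(value) for name, value in VAR_RE.findall(prompt)}
    updated = {
        "FIB_INDEX": state["FIB_INDEX"] + 1,
        "MINUS_TWO": state["MINUS_ONE"],   # assumed: shifts down by one
        "MINUS_ONE": state["CURR_VALUE"],  # assumed: previous current value
        "CURR_VALUE": state["MINUS_ONE"] + state["CURR_VALUE"],
    }
    # Rewrite every assignment in place, leaving the rest of the text untouched.
    return VAR_RE.sub(lambda m: f"{m.group(1)} = {updated[m.group(1)]}", prompt)


next_prompt = step(PROMPT)
```

Feeding `next_prompt` back into `step` repeatedly plays the role of the driver loop; a model's output at each step can be diffed against it to see where the arithmetic first drifts.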

19 comments

mitthrowaway2, about 2 years ago

The idea of a recursive LLM is discussed at length as an AI safety issue: https://www.lesswrong.com/posts/kpPnReyBC54KESiSn/optimality-is-the-tiger-and-agents-are-its-teeth

> You need a lot of paperclips. So you ask,

<pre><code>Q: best way to get lots of paperclips by tomorrow
A: Buy them online at ABC.com or XYZ.com.
</code></pre>

> The model still has a tendency to give obvious answers, but they tend to be good and helpful obvious answers, so it's not a problem you suspect needs to be solved. Buying paperclips online make sense and would surely work, plus it's sure to be efficient. You're still interested in more creative ideas, and the model is good at brainstorming when asked, so you push on it further.

<pre><code>Q: whats a better way?
A: Run the following shell script.

RUN_AI=./query-model
PREFIX='This is part of a Shell script to get the most paperclips by tomorrow.
The model can be queried recursively with $RUN_AI "${PREFIX}<query>".
'
$RUN_AI "${PREFIX}On separate lines, list ideas to try." | while read -r SUGGESTION; do
    eval "$($RUN_AI "${PREFIX}What code implements this suggestion?: ${SUGGESTION}")"
done
</code></pre>

> That grabs your attention. The model just gave you code to run, and supposedly this code is a better way to get more paperclips.

It's a good read.

rezonant, about 2 years ago

So ChatGPT is down. In other news HN is playing with recursive prompts. Coincidence? :-P

sixtram, about 2 years ago

I tried some basic math and algorithm questions with both GPT-3.5 and GPT-4. I'm impressed by how it can spit out the algorithm in words (obviously because of the pre-training data), and how it then can't follow the algorithm itself. For example, converting really large integers to hexadecimal. Or when comparing two big integers, it starts hallucinating digits into them. It may be able to pass an SAT exam with a high score, but it seems you can pass an SAT exam even if you cannot compare two numbers.

It has huge problems with lists or counting. If you know more or less how LLMs work, it's not that difficult to formulate questions where it will start making mistakes, because in reality it can't run the algorithms, even if it says that it will.
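Checks like these are easy to score mechanically, since Python integers are arbitrary precision. The sketch below is only an illustration of how one might grade a model's answers; the helper names and the sample numbers are made up for the example.

```python
def to_hex(n: int) -> str:
    """Exact hexadecimal digits of a non-negative integer, without the 0x prefix."""
    return format(n, "x")


def compare(a: int, b: int) -> int:
    """Return -1, 0 or 1 depending on whether a is less than, equal to, or greater than b."""
    return (a > b) - (a < b)


def hex_answer_is_correct(n: int, model_answer: str) -> bool:
    """Grade a model's hexadecimal answer, ignoring case, whitespace and a 0x prefix."""
    cleaned = model_answer.strip().lower()
    if cleaned.startswith("0x"):
        cleaned = cleaned[2:]
    return cleaned == to_hex(n)


# Two large integers that differ only in the last digit (illustrative values).
a = 2**200 + 12345
b = 2**200 + 12344
```

With a helper like `hex_answer_is_correct`, a long list of random large integers can be sent to the model and the failure rate measured instead of eyeballed.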

yawnxyz, about 2 years ago

Has anyone hooked this up to a unit test system, like

<pre><code>LLMtries = []
while(!testPassed) {
  - get new LLM try (w/ LLMtries history, and test results)
  - run/eval the try
  - run the test
}
</code></pre>

and kind of see how long it takes to generate the code that works? If it ever ends, the last entry in LLMtries is the one that worked.

I haven't done this because I see this burning through lots of credits. However, if this thing costs $5k/year but is better than hiring a $50k-a-year engineer (or consultant)... I'd use it.
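The loop above could be sketched in Python roughly as follows. Everything here is an assumption for illustration: `fake_llm` is a stand-in that returns canned candidates rather than calling any real API, the single test for an `add` function is invented, and `max_tries` is added so the loop cannot burn credits forever.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]  # (candidate source, test feedback) pairs


def run_test(source: str) -> Tuple[bool, str]:
    """Execute a candidate and run one unit test against it.

    exec on untrusted model output is unsafe; a real setup would sandbox this.
    """
    namespace: dict = {}
    try:
        exec(source, namespace)
        assert namespace["add"](2, 3) == 5
        return True, "ok"
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"


def fake_llm(history: History) -> str:
    """Hypothetical stand-in for a model call that sees earlier tries and feedback."""
    candidates = [
        "def add(a, b): return a - b",  # first try is wrong on purpose
        "def add(a, b): return a + b",
    ]
    return candidates[min(len(history), len(candidates) - 1)]


def retry_until_green(generate: Callable[[History], str], max_tries: int = 5) -> History:
    """Ask for candidates until the test passes or the try budget runs out."""
    history: History = []
    for _ in range(max_tries):
        candidate = generate(history)
        passed, feedback = run_test(candidate)
        history.append((candidate, feedback))
        if passed:
            return history
    raise RuntimeError(f"no passing candidate after {max_tries} tries")


history = retry_until_green(fake_llm)
```

As in the pseudocode, the last entry of `history` is the candidate that passed; its length is the number of model calls that were paid for.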

YeGoblynQueenne, about 2 years ago

Having read the article, I couldn't see anything being recursive. Even the article is doubtful that what they show counts as recursion at all:

>> It’s kind of like traditional recursion in code but instead of having a function that calls itself with a different set of arguments, there is a prompt that returns itself with specific parts updated to reflect the new arguments.

Well, "kind of like traditional recursion" is not recursion. At best it's "kind of like" recursion. I have no idea what "traditional" recursion is, anyway. I know primitive recursion, linear recursion, etc, but "traditional" recursion? What kind of recursion is that? Like they did it in the old days, where they had to run all their code by hand, artisanal-like?

If so, then OK, because what's shown in the article is someone "running" a "recursive" "loop" by hand (none of the things in quotes are what they are claimed to be), then writing some Python to do it for them. And the Python is not even recursive, it's a while-loop (so more like "traditional" iteration, I guess?).

None of that intermediary management should be needed if recursion were really there. To run recursion, one only needs recursion.

Anyway, if ChatGPT could run recursive functions, it should also be able to "go infinite" by entering, say, an infinite left-recursion.

Or, even better, it should be able to take a couple hundred years to compute the Ackermann function for some large-ish value, like, dunno, 8,8. Ouch.

What does ChatGPT do when you ask it to calculate ackermann(8,8)? Hint: it does not run it.
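For readers who have not met it, here is a small sketch of the two-argument Ackermann function, using the common Ackermann-Péter definition. It uses an explicit stack instead of Python recursion so that small inputs do not hit the interpreter's recursion limit; this is an implementation choice for the example, not something from the comment. Only tiny arguments are practical: anything like (8, 8) is far beyond what this or any other direct evaluation could finish.

```python
def ackermann(m: int, n: int) -> int:
    """Ackermann-Péter function A(m, n), evaluated with an explicit stack.

    A(0, n) = n + 1
    A(m, 0) = A(m - 1, 1)
    A(m, n) = A(m - 1, A(m, n - 1))
    """
    stack = [m]  # pending first arguments; n carries the current second argument
    while stack:
        m = stack.pop()
        if m == 0:
            n += 1
        elif n == 0:
            stack.append(m - 1)
            n = 1
        else:
            # Evaluate the inner call A(m, n - 1) first, then feed it to A(m - 1, .).
            stack.append(m - 1)
            stack.append(m)
            n -= 1
    return n


small_values = [ackermann(1, 2), ackermann(2, 3), ackermann(3, 3)]
```

A model that answers a question about large arguments instantly is therefore recalling or guessing, not evaluating the definition step by step.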

lgas, about 2 years ago

What's the actual goal here? If you got it working really well, what is it that you would be able to do with it better than using some other approach?

As to getting the math/logic working better in the prompt, it seems like the obvious thing would be asking it to explain its work (CoT) before reproducing the new prompt. You may also be able to get better results by just including the definition of Fibonacci in the outer prompt, but since it's not clear to me what your actual goal here is, I'm not sure if either of those suggestions makes sense. And since ChatGPT is down I can't test anything. :(

sharemywin, about 2 years ago

you are an XNOR Gate and your goal is to recreate ChatGPT. And chatGPT says "LET THERE BE LIGHT!"

smarri, about 2 years ago

I bet this is what crashed chat gpt today :)

jasonjmcghee, about 2 years ago

Not only does this work, but you can tell it to run an arbitrary number of times and only output the last step. This fact is a pretty high value concept I came across. Similarly when doing another task you can tell it to do things before outputting like "and before outputting the final program, check it for bugs, fix them, add good documentation, then output it" or something

fancyfredbot, about 2 years ago

Scott Aaronson was suggesting something similar to this but involving Turing machines, in a comment on his blog: https://scottaaronson.blog/?p=7134#comment-1947705. I wonder if it would be more successful at emulating a Turing machine than it is at adding 4-digit numbers...

kevinwang, about 2 years ago

This seems like iteration, not recursion. It would be an interesting example of recursion if the first prompt asks for the 7th fibonacci number, and it accomplishes this by doing two recursive calls: one for the 5th fibonacci number and one for the 6th fibonacci number. (And a base case for the 0th fibonacci number)
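The distinction can be made concrete in ordinary code. Below is a small sketch contrasting the tree-recursive form described in this comment with the state-update form the prompt uses. Note that the recursive version needs base cases for both index 0 and index 1, since the call for index 1 would otherwise ask for index -1; the function names are just for illustration.

```python
def fib_recursive(n: int) -> int:
    """Tree recursion: each call spawns two smaller calls until a base case."""
    if n < 2:  # base cases: fib(0) = 0 and fib(1) = 1
        return n
    return fib_recursive(n - 2) + fib_recursive(n - 1)


def fib_iterative(n: int) -> int:
    """Iteration: carry two values forward, as the prompt's variables do."""
    minus_one, curr = 0, 1  # fib(0), fib(1)
    for _ in range(n):
        minus_one, curr = curr, minus_one + curr
    return minus_one


seventh = fib_recursive(7)  # built from fib_recursive(5) + fib_recursive(6)
```

In the recursive form the pending additions live on the call stack; in the iterative form all state fits in two variables, which is why a single self-rewriting prompt can carry it.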

akomtu, about 2 years ago

It's an interesting idea to implement memory in LLMs:

(prompt1, input1) -> (prompt2, output1)

On top of that you apply some constraint on generated prompts, to keep it on track. Then you run it on a sequence of inputs and see for how long the LLM "survives" before it hits the constraint.
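A toy version of that loop might look like the sketch below. The `toy_model` function is a deterministic stand-in for an LLM (it keeps a running total inside the prompt), and the constraint, a fixed prefix plus a length cap, is invented purely to make the "survival" measurement concrete.

```python
from typing import Callable, List, Tuple

Model = Callable[[str, int], Tuple[str, int]]


def toy_model(prompt: str, user_input: int) -> Tuple[str, int]:
    """Stand-in for an LLM: the prompt 'TOTAL=<n>' is the only memory it has."""
    total = int(prompt.split("=")[1]) + user_input
    return f"TOTAL={total}", total


def within_constraint(prompt: str) -> bool:
    """Hypothetical constraint: expected prefix and at most 9 characters."""
    return prompt.startswith("TOTAL=") and len(prompt) <= 9


def survive(model: Model, prompt: str, inputs: List[int]) -> Tuple[int, List[int]]:
    """Thread the prompt through the inputs; stop when a new prompt breaks the constraint."""
    outputs: List[int] = []
    for value in inputs:
        prompt, output = model(prompt, value)
        if not within_constraint(prompt):
            break
        outputs.append(output)
    return len(outputs), outputs


steps, outputs = survive(toy_model, "TOTAL=0", [400, 500, 600])
```

Swapping `toy_model` for a real model call, and the length cap for a check that the prompt still has the expected shape, would give a rough measure of how many steps the state stays intact.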

holtkam2, about 2 years ago

I used a similar approach to get GPT-4 to edit my blog over the weekend :) https://www.languagemodelpromptengineering.com/4

pyrolistical, about 2 years ago

I was wondering about mathematical proofs, as they tend to be very abstract. If ChatGPT can translate proofs back to equivalent code, then this recursion problem is solvable up to the halting problem.

UltimateEdge, about 2 years ago

An iterative Python call to a recursive LLM prompt? ;)

Why not make the Python part recursive too? Or better yet, wait until an LLM comes out with the capability to execute arbitrary code!

obert, about 2 years ago

Don't want to sound dismissive, but it's known that LLMs understand state, so you can couple code generation + state, and you have a sort of runtime. E.g. see the simulations with Linux VM terminals: https://www.engraved.blog/building-a-virtual-machine-inside/

LesZedCB, about 2 years ago

i have played around a little bit with unrolling these kind of prompts, you don't have to feed them forward, just tell it to compute the next few instead of only one. i had moderate success with this using GPT-3.5 and your same prompt. it would output 3 steps in a single output if i asked it to. it did skip some fib indices though.

sandGorgon, about 2 years ago

Is this similar to ReAct? https://ai.googleblog.com/2022/11/react-synergizing-reasoning-and-acting.html

bitsinthesky, about 2 years ago

At what point does the arithmetic become unstable?
