
Build your own agents which are controlled by LLMs

218 points | by mpaepper | about 2 years ago

11 comments

cube2222 | about 2 years ago
Cool stuff. I think agents are the most promising and potentially revolutionary part of the current LLM developments. Moreover, they're quite easy to write yourself: you don't need something like LangChain, the prompt structure is simple, and it's mostly a bit of text templating (it's all based on the ReAct paper [0]).

That said, the bit about "the trick to avoid hallucination" in the attached blog post is a very neat idea. Obvious in hindsight, as it often is:

> So the trick is that we send a stop pattern, which in this case is when we see Observation: in its output, because by then it has created a Thought and an Action and used a tool, and would hallucinate the Observation: itself :D

> This stop parameter is a normal parameter of the OpenAI API by the way, so nothing special to implement there.

In general, it's magical when you put these things in a feedback loop. The way they can automatically respond to errors and adjust their actions is really cool - take a look at the last gif here [1].

[0]: https://arxiv.org/abs/2210.03629
[1]: https://github.com/cube2222/duckgpt
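The stop-pattern trick described above is easy to reproduce. A minimal sketch of the parsing side in Python - the `completion` string here is invented for illustration, and stands in for what the model returns when the API request is made with `stop=["Observation:"]`:

```python
import re

# Invented example of what the model emits when the request uses
# stop=["Observation:"] -- generation halts before the model can
# hallucinate the tool's result itself.
completion = (
    "Thought: I should look up the population of France.\n"
    "Action: search\n"
    "Action Input: population of France\n"
)

def parse_step(text):
    """Extract the Thought / Action / Action Input fields of one ReAct step."""
    fields = {}
    for key in ("Thought", "Action", "Action Input"):
        match = re.search(rf"^{re.escape(key)}: (.+)$", text, re.MULTILINE)
        if match:
            fields[key] = match.group(1).strip()
    return fields

step = parse_step(completion)
# The caller now runs the named tool and appends a real "Observation: ..."
# line to the prompt before asking the model to continue.
```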
summarity | about 2 years ago
I did similar experiments with GPT-4 and Soulver [1], though I "tuned" it by teaching it Soulver interactively before continuing to prompt. It can then be used in the same way. My primary goal was to add basic calculation capabilities to GPT that are a) guaranteed to halt (all Soulver sheets are functions) and b) readable by both GPT and the user (a Python program may be too dev-oriented for normal people).

It worked quite well. Too well, almost: I started a meta-conversation where I asked another GPT-4 instance to come up with conversations SoulverGPT could have with a user where the addition of solving is beneficial. This worked, and eventually even found a bug in Soulver - essentially fuzzing the language.

[1] https://github.com/soulverteam/SoulverCore
Comment #35447484 not loaded
mkmk | about 2 years ago
"The agent runs in a loop of Thought, Action, Observation, Thought, ... The Thought and Action (with the Action Input to the action) are the parts which are generated by an LLM. The Observation is generated by using a tool (for example the print outputs of Python or the text result of a Google search)."

Really neat to see the parallels to OODA loops, which are a frequent feature (implicit or explicit) of human decision-making processes.
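The quoted loop is short enough to sketch directly. A hypothetical minimal version in Python - the LLM is abstracted to any callable that returns the next Thought + Action text and is assumed to stop before emitting `Observation:`; names and prompt format are illustrative, not the library's actual API:

```python
def run_agent(llm, tools, question, max_steps=5):
    """Minimal Thought -> Action -> Observation loop, ReAct-style.

    `llm` maps the transcript so far to the next step text; `tools` maps
    action names to plain Python callables.
    """
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)  # Thought + Action (+ Action Input)
        transcript += step
        if "Final Answer:" in step:
            return step.split("Final Answer:", 1)[1].strip()
        # Run the named tool and feed its output back as the Observation.
        action = step.split("Action:", 1)[1].splitlines()[0].strip()
        arg = step.split("Action Input:", 1)[1].splitlines()[0].strip()
        transcript += f"Observation: {tools[action](arg)}\n"
    return None

# Scripted stand-in for the model, to show the control flow end to end.
steps = iter([
    "Thought: I can just compute it.\nAction: calc\nAction Input: 2 + 2\n",
    "Thought: I have the answer.\nFinal Answer: 4\n",
])
answer = run_agent(lambda _: next(steps),
                   {"calc": lambda e: str(eval(e))},
                   "What is 2 + 2?")
```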
Comment #35449221 not loaded

Comment #35447258 not loaded
jamesfisher | about 2 years ago
This makes so much more sense than LangChain! I read all the docs for LangChain, but the abstractions seemed broken and it wasn't obvious what was happening at the LLM level. In contrast, I grokked the entire llm_agents API within 5 minutes.
Comment #35450511 not loaded
skybrian | about 2 years ago
A tool like this with a single-step mode (a UI to preview what it will do before it does it) might be nice. Ideally you could collect traces of a repetitive task and eventually turn the traces into a script that doesn't need the LLM, and therefore isn't so expensive to run.
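A sketch of how that single-step mode might look, assuming the agent's tools are plain callables. `approve` stands in for a real UI prompt, and the recorded `trace` is what you would later replay as a plain script, with no LLM in the loop:

```python
def with_confirmation(tools, trace, approve=input):
    """Wrap each tool so every call is previewed, approved, and logged."""
    def wrap(name, fn):
        def wrapped(arg):
            # Show the proposed call before anything runs.
            if approve(f"Run {name}({arg!r})? [y/N] ").strip().lower() != "y":
                return "(skipped by user)"
            trace.append((name, arg))  # replayable later without the LLM
            return fn(arg)
        return wrapped
    return {name: wrap(name, fn) for name, fn in tools.items()}
```

Swapping the wrapped tools into the agent loop is enough to get the preview behavior; the trace doubles as an audit log.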
Comment #35449510 not loaded
furyofantares | about 2 years ago
This is cool. I haven't seen any good/useful agents built using GPT-4 yet - it doesn't seem to be for lack of trying - but maybe some good libraries for building them will help. There are some good ideas in this one.

That said, while I am very, very impressed with GPT-4 for lots of uses right now, so far it's just not clear to me that feeding it back into itself is fruitful at this point.

When I use GPT-4 for coding, I give it MUCH higher-level instructions than I use on a search engine, much closer to the actual problem I'm solving. But I'm still breaking the problem down into smaller problems; I need to read the output, fix errors or instruct it to fix errors, and then have it build more features on. It's similar with creative processes, brainstorming, and other writing.

These agents largely strike me as an attempt to replace this whole fact-checking/editing routine with the LLM itself; but seeing as that's the thing the LLM is not yet good at, I'm not sure how much progress can be made there, vs. just waiting for GPT-5 and hoping it's another big leap in capabilities.
Comment #35447573 not loaded

Comment #35447692 not loaded
eshnil | about 2 years ago
I built this Puppeteer-based agent to do online tasks in the browser, described in natural language: https://github.com/aicombinator/bot
Comment #35448736 not loaded
joshcam | about 2 years ago
Works great, thank you for sharing! Faster than LangChain on basic stuff so far, and I do love the simple nature of it. Might make a Reddit tool real quick and a PR for it.

https://imgur.com/a/kJvjJH3

If I had GPT-4 API access, it would have found that on the first try. Sigh.

--

I've noticed it does skip the thought process sometimes:

Question: Who is the president? What year whas he born. Name one other famous person born in that year. Thought:

Final answer is The current president of the United States is Joe Biden, born in 1942. Some other famous people born in the same year include Harrison Ford, Muhammad Ali, and Aretha Franklin.
Comment #35449543 not loaded
wjessup | about 2 years ago
What all these tools need to adopt is sending 10-20 requests out and finding the "best" response. I think it's incorrect that we try to get the tool to work right the first time. Auto-GPT has JSON parse errors 20-50% of the time. Instead, with enough parallel responses we can increase the likelihood one of them is "really good". The next challenge is figuring out which answer is really good and continuing with that.
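A sketch of that selection step, assuming the model is asked to respond in JSON. `sample` stands in for one API call at nonzero temperature (or one element of the batch the OpenAI `n` parameter returns), and `score` is the genuinely hard part the comment points at - it is stubbed out here:

```python
import json

def best_of_n(sample, n=10, score=lambda c: 0):
    """Draw n candidate responses, drop any that fail to parse as JSON,
    and return the highest-scoring survivor (None if all n failed)."""
    candidates = []
    for _ in range(n):
        try:
            candidates.append(json.loads(sample()))
        except json.JSONDecodeError:
            continue  # malformed output: exactly the Auto-GPT failure mode
    return max(candidates, key=score, default=None)

# Scripted stand-in for the API: two of every three outputs are malformed.
import itertools
outputs = itertools.cycle(['oops, not JSON', '{"answer": 4}', '{"answer":'])
result = best_of_n(lambda: next(outputs), n=6)
```

With a 20-50% parse-failure rate per response, the chance that all ten parallel responses fail is well under 1%, which is the whole argument for sampling wide.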
Comment #35449837 not loaded

Comment #35449643 not loaded

Comment #35456564 not loaded
antiatheist | about 2 years ago
I started naively implementing something similar in a project before reading more about TOAT loops and LangChain.

Though the project makes a decent GUI for using ChatGPT if anyone is interested: https://github.com/blipk/gptroles

You can run and edit code snippets in the chat interface.
qaq | about 2 years ago
It feels that with a c2 layer that lets the LLM mutate the c2 layer itself, we will soon be entering AGI land.
Comment #35447416 not loaded