
Gödel Agent: A self-referential agent framework for recursive self-improvement

81 points, by tkgally, 7 months ago

11 comments

grahamj, 7 months ago
heh, I was just working on something that tries to improve itself today. I wrote a simple agent executor that makes calling an agent a simple function call, and then wrote an agent that invents other agents. By calling that in a loop for a while I ended up with, effectively, a large library of functions I not only didn't write but didn't even think up.

By passing those functions as tools in LLM requests, any of the agents can make use of any of the other agents, so it's basically expanding its own capabilities.

Not quite sure what task to sic it on yet, but it's fun to play with.
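The loop the commenter describes, a registry of agents where each new agent can use every earlier one as a tool, can be sketched in a few lines. All names below are hypothetical; a real version would call an LLM to generate each new agent, whereas this stub just composes existing ones mechanically.

```python
# Minimal sketch of the "agents inventing agents" loop described above.
# Hypothetical illustration: invent_agent() stands in for an LLM call.

registry = {}  # name -> callable; doubles as the shared "tool library"

def register(name, fn):
    registry[name] = fn

# Seed agents the executor starts with.
register("upper", lambda s: s.upper())
register("exclaim", lambda s: s + "!")

def invent_agent(i):
    """Stand-in for an LLM call: compose two existing agents into a new one."""
    names = sorted(registry)
    a, b = names[i % len(names)], names[(i + 1) % len(names)]
    fa, fb = registry[a], registry[b]
    register(f"{a}_then_{b}", lambda s, fa=fa, fb=fb: fb(fa(s)))

# Each pass through the loop can build on every agent invented so far,
# so the library grows with functions nobody wrote by hand.
for i in range(3):
    invent_agent(i)

print(sorted(registry))
print(registry["exclaim_then_upper"]("hi"))  # -> "HI!"
```

The key design point is that the registry itself is what gets passed back as the tool set, which is what makes the expansion self-referential.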
blackcat201, 7 months ago
Shameless plug: for anyone interested in "self-improvement" agents, check out StreamBench[1], where we benchmark and test what's essential for improvement in online settings. Basically, we find the feedback signal is vital: the stronger the signal, the more improvement you can get, if you are able to feed it back to the agent as weights (LoRA) or in-context examples.

[1] https://arxiv.org/abs/2406.08747
jlopes2, 7 months ago
Let's see the code. I'm a bit skeptical that this hasn't over-complicated something architecturally. We need clearer architecture diagrams: what prompts exist, what tool calls are made, and what gets updated.
gdiamos, 7 months ago
Can it modify its training data?
YetAnotherNick, 7 months ago
> For the Godel Agent, we utilize the "gpt-4o-2024-05-13" model (OpenAI et al., 2024), whereas the optimized policy and baseline models are evaluated using the "gpt-3.5-turbo-0125" model (OpenAI, 2022) to reduce computational costs and ensure a fair comparison.

Doesn't seem fair at all.
digitcatphd, 7 months ago
I'm skeptical this would work better in production than RLHF. If the agent makes a mistake, how is it supposed to know to correct itself and understand what it did wrong so it can prevent it? It seems better to retry recursively until it finds the solution, like a human would.
jondwillis, 7 months ago
That's a lot of words; where is the code to reproduce it?
kelseyfrog, 7 months ago
What a strange loop
optimalsolver, 7 months ago
> The rapid advancement of large language models (LLMs) has significantly enhanced the capabilities of AI-driven agents across various tasks

No it hasn't.
m3kw9, 7 months ago
If their demo works, they must be close to AGI, right?
pajeets, 7 months ago
Meh, I'm not convinced that any sort of framework or side tool that works on top of large language models is the solution.

We really need something intelligent (no, o1 doesn't count), and it's unclear what that will look like. Perhaps it will be some RNN with neurosymbolism.