
LLM Powered Autonomous Agents

285 points by DanielKehoe, almost 2 years ago

14 comments

TekMol, almost 2 years ago
Do I understand it correctly that LLMs are neural networks which only ever output a single "token", which is a short string of a few chars? And then the whole input plus that output is fed back into the NN to produce the next token?

So if you ask ChatGPT "Describe Berlin", what happens is that the NN is called 6 times with these inputs:

```
Input:  Describe Berlin.
Output: Berlin
Input:  Describe Berlin. Berlin
Output: is
Input:  Describe Berlin. Berlin is
Output: a
Input:  Describe Berlin. Berlin is a
Output: nice
Input:  Describe Berlin. Berlin is a nice
Output: city
Input:  Describe Berlin. Berlin is a nice city
Output: .
```

ChatGPT's answer:

```
Berlin is a nice city.
```

Is that how LLMs work?
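That is essentially the picture: one forward pass scores every candidate next token, one token is chosen, and the extended sequence becomes the next input. A minimal sketch of the loop in Python, assuming a hypothetical `model` that returns next-token logits and a `tokenizer` with `encode`/`decode` (illustrative names, not any specific library's API):

```python
# Greedy autoregressive decoding, sketched with hypothetical objects.
# `model(tokens)` is assumed to return one logit per vocabulary entry.
def generate(model, tokenizer, prompt, max_new_tokens=50):
    tokens = tokenizer.encode(prompt)       # "Describe Berlin." -> token ids
    for _ in range(max_new_tokens):
        logits = model(tokens)              # score every candidate next token
        next_token = max(range(len(logits)), key=logits.__getitem__)
        tokens.append(next_token)           # feed the output back as input
        if next_token == tokenizer.eos_id:  # stop at end-of-sequence
            break
    return tokenizer.decode(tokens)
```

Two real-world caveats: tokens are usually subword pieces rather than whole words, and production systems cache the per-token attention state (the KV cache) so each step does not recompute the whole prefix. Chat models also typically sample from the output distribution rather than always taking the argmax, but the feed-the-output-back-in structure is exactly as described.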
swyx, almost 2 years ago
> Challenges in long-term planning and task decomposition: Planning over a lengthy history and effectively exploring the solution space remain challenging. LLMs struggle to adjust plans when faced with unexpected errors, making them less robust compared to humans who learn from trial and error.

While working on smol-developer I also eventually landed on the importance of planning, as mentioned in my Agents writeup: https://www.latent.space/p/agents. I feel some hesitation suggesting this because it implies I'm not deep-learning-pilled, but I really wonder how far next-token prediction can go with planning. When I think about planning, I think about mapping out a possibility space, identifying trees of dependencies, assigning priorities, and then solving for some kind of weighted shortest path. That's an awful lot of work to expect of a next-token predictor (goodness knows it's scaled far beyond what anyone thought - is there any limit to next-token prediction?).

If there were one focus area for GPT-5, my money would be on a better architecture capable of planning.
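The dependency-tree-plus-priorities part of that description is easy to render as explicit search. A toy sketch with made-up task names and costs (my illustration, not from the article or the linked writeup); real planners additionally search over alternative action sequences, which is where weighted shortest-path algorithms come in:

```python
# Toy sketch: planning as search over a task dependency graph.
# Task names, dependencies, and costs are all hypothetical.
def plan(goal, deps, cost):
    """deps: task -> prerequisite tasks; cost: task -> effort estimate."""
    order, seen = [], set()

    def visit(task):                  # depth-first over prerequisites
        if task in seen:
            return
        seen.add(task)
        for dep in sorted(deps.get(task, []), key=cost.get):
            visit(dep)                # cheaper prerequisites first
        order.append(task)

    visit(goal)
    return order, sum(cost[t] for t in order)

deps = {"ship": ["test", "docs"], "test": ["code"], "docs": ["code"]}
cost = {"ship": 1.0, "test": 2.0, "docs": 0.5, "code": 5.0}
print(plan("ship", deps, cost))  # (['code', 'docs', 'test', 'ship'], 8.5)
```

The open question in the comment is whether a next-token predictor can learn to perform this kind of search implicitly, or whether it needs an explicit planner bolted on.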
Roark66, almost 2 years ago
Seriously, LLMs are remarkable tools, but they are horribly unreliable. What tasks could such an autonomous agent do (beyond what a chatbot, perhaps extended with web access, already does)? I mean, which task is so complex that one can't just automate it with simple scripting, yet so non-critical that it's fine to let an AI LLM handle it and have it go wrong? BTW, running these models is rather expensive, so the task also has to be one that is expensive now, perhaps because it is completed by a human.
novaRom, almost 2 years ago
How small can an LLM transformer be and still understand basic human language and search for answers on the internet? It need not contain all the facts and knowledge, but it must be quick (so, a small model), understand at least one language, and know how and where to look for answers.

Would 1B, 3B, or 7B parameters be sufficient to achieve this? Or is it doable with 100M or even fewer parameters? The vocabulary size might be quite small, and the max context size could also be limited to 256 or 512 tokens. Is there a paper on that, maybe?
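For a sense of scale, the standard back-of-the-envelope parameter count for a decoder-only transformer can be sketched as follows (the config below is a hypothetical example, not a recommendation):

```python
# Rough decoder-only transformer parameter count, ignoring biases,
# layernorms, and positional embeddings (all small by comparison).
def param_count(vocab_size, d_model, n_layers):
    embedding = vocab_size * d_model   # token embedding (often tied with the output head)
    per_layer = 12 * d_model ** 2      # attention (4*d^2) + 4x-wide MLP (8*d^2)
    return embedding + n_layers * per_layer

# A tiny hypothetical config: 8k vocabulary, width 512, 8 layers
print(param_count(8_000, 512, 8))   # 29,261,824 -> ~29M parameters
```

So a sub-100M model is architecturally trivial to build; whether such a model can reliably parse queries and drive a search tool is an empirical question about training data and capability, not parameter arithmetic.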
snowcrash123, almost 2 years ago
Good read. Currently there are a lot of issues in autonomous agents apart from the finite context length, task decomposition, and natural-language-as-interface problems mentioned in the article. I think for agents to truly find adoption in the real world, agent trajectory fine-tuning is a critical component: how do you make an agent perform better at a particular objective with every subsequent run? Basically, making the agents learn similarly to how we learn.

Also, I think current LLMs might not fit agent use cases well in the mid to long term, because the RL they go through is based on input/best-output methods, whereas the intelligence you need in agents is more about building an algorithm to achieve an objective on the fly. This perhaps requires a new type of large model (Large Agent Models?) trained using RLfD (Reinforcement Learning from Demonstration).

Also, I think one of the key missing pieces is a highly configurable software middleware between intelligence (LLMs), memory (vector DBs ~ LTMs, STMs), tools, and workflows across every iteration. The current agent core loop for finding the next best action is too simplistic; the core self-prompting loop of an agent should be configurable for the use case at hand (see the sketch after this comment). E.g., in BabyAGI every iteration goes through a workflow of plan, prioritize, and execute; in AutoGPT the next best action is chosen based on LTM/STM; in GPTEngineer it is write specs > write tests > write code. For a dev-infra monitoring agent the workflow might be totally different: consume logs from tools like Grafana, Splunk, and APMs > check for an anomaly > if there is one, take human input for feedback. Every real-world use case has its own workflow, and current agent frameworks hard-code this in the base prompt. In SuperAGI (https://superagi.com) (disclaimer: I'm its creator), the core iteration workflow of an agent can be defined as part of agent provisioning.

Another missing piece is the notion of knowledge. Agents currently depend entirely on the knowledge in LLMs or on search results to execute tasks, but if a specialized knowledge set is plugged into an agent, it performs significantly better.
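A minimal sketch of that configurable-workflow idea, with purely illustrative names (this is not the SuperAGI API): the per-iteration workflow is passed in as data rather than hard-coded into a base prompt.

```python
from typing import Callable

Step = Callable[[dict], dict]       # each step transforms the agent state

def run_agent(state: dict, workflow: list[Step], max_iters: int = 10) -> dict:
    for _ in range(max_iters):
        for step in workflow:       # the configurable part: workflow is data
            state = step(state)
        if state.get("done"):
            break
    return state

# A BabyAGI-style workflow: plan -> prioritize -> execute
def plan(s):
    s.setdefault("tasks", ["write report"])   # seed a toy task once
    return s

def prioritize(s):
    s["tasks"].sort()
    return s

def execute(s):
    if s["tasks"]:
        print("executing:", s["tasks"].pop(0))
    s["done"] = not s["tasks"]
    return s

final = run_agent({"objective": "demo"}, [plan, prioritize, execute])
```

A monitoring agent would swap in a different step list (e.g. consume_logs, detect_anomaly, ask_human) without touching the loop itself.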
p-e-w, almost 2 years ago
A fairly lengthy article about Autonomous AI, and, as far as I can tell, *not a single word about the safety implications of such a system* (a short note about reliability of LLMs is all we get, and it's not clear that the author means anything more than "the thing might break").

I get that there are different philosophies on AI risk, but this is like reading an in-depth discussion about potential atomic bombs in 1942, with no mention of the fact that such a bomb could potentially level cities and kill millions.

If a field of research ever needed oversight and regulation, this is it. I'm not convinced that would solve the problem, but allowing this kind of rushing forward to continue is madness.
mercurialsolo, almost 2 years ago
Autonomy without alignment is a slippery road.

Autonomous agents need guardrails and oversight. An autonomous agent let loose with all the tools in the world will, in essence, lead to an outcome that is not predicted to be in our favour.

Which is why the OpenAI app store and plugins scare me more than anything else: more likely than not, they are tool and data feeders into a large-scale autonomous system.
Animats, almost 2 years ago
It's plausible, but it was partly written by ChatGPT. From the paper: "Big thank you to ChatGPT for helping me draft this section." So of course it's plausible. That's what ChatGPT does.

There are a number of systems where someone bolted together components like this. Now we need more demos and evaluations of them. I just read a comment about someone who tried a sales chatbot. It would make up nonexistent products to respond to customer requests.

The underlying LLM systems need some kind of confidence metric output.
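One simple candidate for such a metric, commonly used as a heuristic, is the average per-token log-probability that many inference APIs already expose. A sketch (note the well-known blind spot: it does not catch a model that is confidently wrong):

```python
import math

def confidence(token_logprobs: list[float]) -> float:
    """Geometric-mean token probability; near 1.0 = certain, near 0 = guessing."""
    return math.exp(sum(token_logprobs) / len(token_logprobs))

print(confidence([-0.05, -0.10, -0.02]))  # ~0.945: high confidence
print(confidence([-2.3, -1.9, -2.7]))     # ~0.100: likely confabulating
```

Low scores flag answers the model was effectively guessing at, which is useful for routing to a human; reliably detecting confident fabrication remains an open problem.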
m3kw9, almost 2 years ago
AutoGPT and babyagi are taking a very long nap till GPT-5 comes out.
arisAlexis, almost 2 years ago
Make me some paperclips please. Do not kill any humans.
ChatGTP, almost 2 years ago
*However, the reliability of model outputs is questionable, as LLMs may make formatting errors and occasionally exhibit rebellious behavior (e.g. refuse to follow an instruction).*

Right… sounds quite reckless?
gremlinsinc, almost 2 years ago
Wow, this was a brilliant summary of much research into AI agents. I've been reading a lot about these and following this stuff, but I still learned a lot.
Xen9, almost 2 years ago
I predict this type of cognitive engineering of LLM multi-agents will become a thing.
atomlib, almost 2 years ago
Coolest post on this subreddit.