How to enhance generative AI's problem-solving capabilities, boost productivity

40 points by monkeydust, 12 months ago

6 comments

Animats, 12 months ago
The article starts out as if it's headed for "and that's how we did it". But no. There's no implementation.

"Imagine a virtual team of AI agents, each with its workflow’s own specialism, collaborating to solve problems and make decisions just like a human team would."

OK. Where does that go? So far, multi-agent systems have been delegating simple and well-bounded tasks, such as "fetch the weather info for Outer Nowhere" or "check airline schedules for flights from JFK to ORD", or even "what is 25% of $50". Those are questions inexpensive to answer, and don't need much management. If the subagents are complex, they will need management, and probably budgeting. Subagents need to know when to stop and when to approximate. If the subagents are themselves generative AI systems, there's potential for hallucination at the lower levels generating info that the higher levels take as valid. Subagents also need to be able to query their managers - "is this enough detail" is a reasonable question to pass upwards. They may need to talk to their peer agents.

Now you have all the problems of organizational dynamics within a multi-agent AI system.

I look forward to reading papers with titles such as:

- "Teams of generative AI agents for coding - scrum or waterfall?"
- "Span of control - how many subagents should an agent manage?"
- "Does the agent org chart influence the solution too much?"
- "Resolving disagreements between specialized subagents".

That's where this is going. It has to. Once you start to cut a problem into pieces to be handled by different units, all those problems arise.
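
To make the budgeting and "know when to stop and when to approximate" points concrete, here is a minimal toy sketch. Everything in it is an assumption for illustration: `Budget`, `Result`, and `sqrt_subagent` are hypothetical names, and a Newton's-method square root stands in for whatever expensive work a real subagent would do. The only idea it tries to show is that a subagent can stop on a cap and flag its answer as approximate, so the level above does not take it as exact.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    """Hypothetical step cap that a manager hands to a subagent."""
    max_steps: int
    steps_used: int = 0

    def spend(self) -> bool:
        """Consume one step; return False once the cap is reached."""
        if self.steps_used >= self.max_steps:
            return False
        self.steps_used += 1
        return True


@dataclass
class Result:
    value: float
    approximate: bool  # True when the subagent stopped because the budget ran out


def sqrt_subagent(x: float, budget: Budget, tolerance: float = 1e-9) -> Result:
    """Toy subagent (assumes x >= 0): refine a square-root estimate until it is
    within tolerance or the budget is exhausted, whichever comes first."""
    estimate = x if x > 1 else 1.0
    while abs(estimate * estimate - x) > tolerance:
        if not budget.spend():
            # Out of budget: return the best guess, clearly marked as approximate.
            return Result(estimate, approximate=True)
        estimate = 0.5 * (estimate + x / estimate)  # one Newton step
    return Result(estimate, approximate=False)


# A tight budget yields a flagged approximation; a generous one converges.
rough = sqrt_subagent(2.0, Budget(max_steps=2))
exact = sqrt_subagent(2.0, Budget(max_steps=50))
print(rough)
print(exact)
```

A manager in this sketch would check `approximate` before passing the value upward, which is one crude answer to the "is this enough detail" question raised above.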
chx, 12 months ago
> How to enhance generative AI's problem-solving capabilities,

A zero multiplied by whatever is still zero.

It cannot solve anything, with one broad category of exceptions, as https://hachyderm.io/@inthehands/112006855076082650 brilliantly explains:

> You might be surprised to learn that I actually think LLMs have the potential to be not only fun but genuinely useful. “Show me some bullshit that would be typical in this context” can be a genuinely helpful question to have answered, in code and in natural language — for brainstorming, for seeing common conventions in an unfamiliar context, for having something crappy to react to.

> Alas, that does not remotely resemble how people are pitching this technology.
smarm52, 12 months ago
A nice summary of an idea similar to Multi-Agent Systems.

https://en.wikipedia.org/wiki/Multi-agent_system
hubraumhugo, 12 months ago
I recently wrote a blog post about why "AI agents" are still too early, too expensive, too unreliable: https://www.kadoa.com/blog/ai-agents-hype-vs-reality

The WebArena leaderboard[0], which benchmarks LLM agents against real-world tasks, shows that even the best-performing models have a success rate of only 35.8%.

[0] https://docs.google.com/spreadsheets/d/1M801lEpBbKSNwP-vDBkC_pF7LdyGU1f_ufZb_NWNBZQ/edit#gid=0
NicoJuicy, 12 months ago
These things will have the same flaw as no-code tools.

No one is going to remember how the system works, and all those prompt engineers are going to find out that programming languages are well documented, but things like migrations, multitenancy, ... aren't.

Good luck when an AI provider ships a breaking change in an API that people rely on.

Or when an issue happens and it can't find logs, ... (if logging was even implemented :p)
schmidtleonard, 12 months ago
> The productivity benefits perhaps take us closer to the aspiration Keynes had when he wrote Economic Possibilities for our Grandchildren in 1930, in which he forecast that in a hundred years, thanks to technological advancements improving the standard of living, we could all be doing 15-hour work weeks.

Well, you see, the benefits have to be split between capital and labor.

The system is called "capitalism."

Figure it out.