"Research shows that AI systems with 30+ agents out-performs a simple LLM call in practically any task (see More Agents Is All You Need), reducing hallucinations and improving accuracy."<p>Has anyone heard of that actually playing out practically in real-world applications? This article links to the paper about it - <a href="https://arxiv.org/abs/2402.05120" rel="nofollow">https://arxiv.org/abs/2402.05120</a> - but I've not heard from anyone who's implementing production systems successfully in that way.<p>(I still don't actually know what an "agent" is, to be honest. I'm pretty sure there are dozens of conflicting definitions floating around out there by now.)
It's interesting how long the word "agents"/"intelligent agents" have been around for and how long they've been hyped up for. If you go back to the 80s and 90s you will see how Microsoft was hyping up "intelligent agents" in Windows but nothing ever became of it[1].<p>I have yet to see an actual useful usecase for agents despite the countless posts asking for examples nobody has provided one.<p>[1] <a href="https://www.wired.com/1995/09/future-forward/" rel="nofollow">https://www.wired.com/1995/09/future-forward/</a>
Is there an agent framework that lives up to the hype?<p>Where you specify a top-level objective, it plans out those objectives, it selects a completion metric so that it knows when to finish, and iterates/reiterates over the output until completion?
Can I just ask whether other people think that "agentic" is a word?<p>As far as I can tell it's not in the OED or Miriam Webster dictionaries. But recently everyone's using it so perhaps it soon will be.
If we structured AI agents like big tech org charts, which company structures would perform better? Inspired by James Huckle's thoughts on how organizational structures impact software design, I decided to put this to the test: <a href="https://bit.ly/ai-corp-agents" rel="nofollow">https://bit.ly/ai-corp-agents</a>.
- Big tech is very different from open source<p>- The original SWE-bench paper only consists of solved issues when a big part of Open Source is triage, follow-up, clarification and dealing with crappy issues<p>- Saying "<Technique> is all you need" when you are increasing your energy usage 30-fold just to fail > 50% of the time is intellectually dishonest