TechEcho

How we improved GPT-4o multi-step function calling success rate by 4x

13 points by jimminyx 6 months ago

6 comments

doctorpangloss 6 months ago
They have identified a big problem with frontier models’ function calling, which is that it doesn’t really work with more than 3 functions, but:

> instead of allowing the agent complete freedom in choosing from all possible API calls, AGS only presents the contextually relevant options based on where the agent is in its workflow.

Sounds like it will have to be bespoke to each task. Joe Blow enterprise PM farming this out to Jerald Blophus tech agency farming it out to BlowStar Solutions: they’re not going to be able to do this one.

If it doesn’t look like web development, where you have Yavascript and npx create-app something something, you haven’t solved the DX problem either.

It’s hard to find an organic-looking conversation that would lead to each permutation of your tool calls, and to some loops inside of it, whether or not you use Xpander’s method. If you don’t test, you might as well have an agent that can only call one function. This is one of many reasons that guided OpenAI, I’m sure, to train on just a few functions available to call, and it’s frustrating to read any blog post that doesn’t address “Why don’t the frontier model developers just do this themselves?”
momopoco 6 months ago
Isn’t this just LangGraph?
puppycodes 6 months ago
It drives me absolutely nuts when these companies bury their pricing behind a sales wall or a signup. It makes you look either shady or too expensive or both. If you're free and open source, then please display it, because you should be proud. Don't waste our time if we can't afford it.
tlarkworthy 6 months ago
Isn't it obvious, when you work with a stochastic system, that giving it tons of wrong moves is gonna increase the failure rate?
OutOfHere 6 months ago
I don't see why this filtering is difficult for the user to do directly. It seems trivial to do as needed.
petesergeant 6 months ago
This doesn't appear to be a "how we improved" article; it looks like a press release for some product.
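
For readers following the thread, the idea being debated (showing the model only the tools relevant to the current workflow step, rather than every function at once) can be sketched roughly as below. This is a minimal illustration and not the product's actual implementation: the tool names, the `ALLOWED_BY_STATE` and `NEXT_STATE` tables, and the helper functions are all hypothetical, and no real LLM API is called.

```python
from typing import Dict, List, Tuple

# Hypothetical tool catalogue; in practice each entry would also carry
# a parameter schema in whatever format the model provider expects.
ALL_TOOLS: List[dict] = [
    {"name": "search_orders", "description": "Find a customer's orders"},
    {"name": "get_order", "description": "Fetch one order by id"},
    {"name": "issue_refund", "description": "Refund an order"},
    {"name": "send_email", "description": "Email the customer"},
]

# Hypothetical workflow graph: state -> tool names allowed in that state.
ALLOWED_BY_STATE: Dict[str, List[str]] = {
    "start": ["search_orders"],
    "order_found": ["get_order", "issue_refund"],
    "refunded": ["send_email"],
}

# Hypothetical transitions: (state, tool called) -> next state.
NEXT_STATE: Dict[Tuple[str, str], str] = {
    ("start", "search_orders"): "order_found",
    ("order_found", "get_order"): "order_found",
    ("order_found", "issue_refund"): "refunded",
    ("refunded", "send_email"): "done",
}


def tools_for_state(state: str) -> List[dict]:
    """Return only the tool definitions allowed in the given state."""
    allowed = set(ALLOWED_BY_STATE.get(state, []))
    return [tool for tool in ALL_TOOLS if tool["name"] in allowed]


def advance(state: str, tool_name: str) -> str:
    """Reject out-of-state tool calls, otherwise move to the next state."""
    if tool_name not in ALLOWED_BY_STATE.get(state, []):
        raise ValueError(f"{tool_name!r} is not allowed in state {state!r}")
    return NEXT_STATE[(state, tool_name)]
```

On each turn, the caller would pass `tools_for_state(state)` as the tool list for the model request and then call `advance` with whichever tool the model picked. As several commenters note, the graph itself still has to be written and tested per task, which is where the real effort lies.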