LangChain agents work reasonably well with multiple tools, but the response time until the final output is usually around 40 seconds. Has anyone tried an architecture that made them much faster without compromising output quality?
I have been using sllim (note: I am the author), which lets me easily parallelize the network requests to speed up agents.<p>It only helps with distributed chains (like tree of thoughts) and doesn't help with sequential chains.<p>link: <a href="https://news.ycombinator.com/item?id=36913492">https://news.ycombinator.com/item?id=36913492</a>
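This isn't sllim's API, but the underlying idea — fanning out the independent branches of a distributed chain as concurrent requests — can be sketched with plain asyncio. Here `call_llm` is a hypothetical stand-in for a real network-bound LLM call:

```python
import asyncio
import time

async def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a network-bound LLM request
    # (e.g. one branch of a tree-of-thoughts expansion);
    # sleeps to simulate latency.
    await asyncio.sleep(0.1)
    return f"response to: {prompt}"

async def expand_parallel(prompts: list[str]) -> list[str]:
    # Fire all branch requests concurrently instead of awaiting
    # them one by one; total wall time is roughly the slowest
    # single call, not the sum of all calls.
    return await asyncio.gather(*(call_llm(p) for p in prompts))

async def main() -> tuple[list[str], float]:
    prompts = [f"thought {i}" for i in range(5)]
    start = time.perf_counter()
    results = await expand_parallel(prompts)
    elapsed = time.perf_counter() - start
    return results, elapsed

results, elapsed = asyncio.run(main())
```

With five 0.1 s calls, the sequential version would take about 0.5 s while the gathered version finishes in roughly 0.1 s. A sequential chain gets no benefit, since each step's input depends on the previous step's output.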