Evaluating modular RAG with reasoning models

62 points by emil_sorensen, 3 months ago

7 comments

serjester, 3 months ago
We tried something similar and found much better results with o1 pro than o3 mini. RAG seems to require a level of world knowledge that the mini models don’t have.

The trade-off is significantly higher latency and cost. But for us, answer quality is a much higher priority.
SubiculumCode, 3 months ago
I found the parts discussing the current limitations of LLMs' understanding of tools interesting. Despite its apparent reasoning abilities, the model didn't seem to have an intuitive understanding of when to use the specific search tools.

I wonder whether this would benefit from a fine-tuned LLM module for that specific step, or even from providing a set of examples in the prompt of when to use which tool?
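
A minimal sketch of the second idea in this comment, putting examples of when to use which tool directly in the prompt. The tool names (`docs_search`, `code_search`), the example questions, and the prompt layout are all hypothetical and not taken from the article; the sketch only builds the prompt string and leaves the actual model call out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolExample:
    question: str
    tool: str
    reason: str


# Hypothetical tools; a real pipeline would list its own search tools here.
TOOLS = {
    "docs_search": "Search product documentation.",
    "code_search": "Search source code and API references.",
}

# Hypothetical few-shot examples showing when each tool applies.
EXAMPLES = [
    ToolExample(
        "How do I configure SSO?",
        "docs_search",
        "Setup guides live in the documentation.",
    ),
    ToolExample(
        "What arguments does create_client accept?",
        "code_search",
        "Signatures are defined in source code.",
    ),
]


def build_tool_prompt(question, tools=TOOLS, examples=EXAMPLES):
    """Return a prompt that lists the tools, shows worked examples,
    and ends with the new question awaiting a tool choice."""
    lines = ["Pick exactly one tool for the question.", "", "Tools:"]
    for name, description in tools.items():
        lines.append(f"- {name}: {description}")
    lines += ["", "Examples:"]
    for ex in examples:
        lines.append(f"Q: {ex.question}")
        lines.append(f"Tool: {ex.tool} ({ex.reason})")
    lines += ["", f"Q: {question}", "Tool:"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_tool_prompt("Where is the retry logic implemented?"))
```

Whether a handful of examples like this is enough, or whether a fine-tuned module is needed for the tool-selection step, is exactly the open question raised above; the sketch does not settle it.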
EngineeringStuf, 3 months ago
Am I correct in reading that the RAG pipeline runs in real time in response to a user query?

If so, I would suggest running it ahead of time and having the LLM generate possible questions from the context of each semantically split chunk.

That way you only need to compare the embeddings at query time, and the results will already be pre-sorted and ranked.

The trick, of course, is chunking correctly and generating the right questions. But in both cases I would look to the LLM to do that.

Happy to share some tips on semantically splitting documents using the LLM with really low token usage if you're interested.
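
A rough sketch of the suggestion above: generate questions per chunk offline, index the question embeddings, and do only vector comparisons at query time. Everything named here is an assumption for illustration. `embed` is a toy bag-of-words stand-in for a real embedding model, `generate_questions` is a placeholder for the offline LLM call, and the chunks and questions in `PREGENERATED` are made up.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical output of the offline step: chunk -> questions it answers.
PREGENERATED = {
    "Rotate API keys from the dashboard under Settings.": [
        "How do I rotate an API key?",
        "Where can I change my API keys?",
    ],
    "Webhooks retry failed deliveries up to five times.": [
        "How many times are webhooks retried?",
        "What happens when a webhook delivery fails?",
    ],
}


def generate_questions(chunk):
    """Placeholder for the LLM call that would run ahead of time."""
    return PREGENERATED[chunk]


def build_index(chunks):
    """Offline: embed every generated question and keep a pointer to its chunk."""
    return [(embed(q), chunk) for chunk in chunks for q in generate_questions(chunk)]


def retrieve(index, query, k=1):
    """Query time: rank stored question vectors against the query vector only."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda entry: cosine(query_vec, entry[0]), reverse=True)
    results = []
    for _, chunk in ranked:
        if chunk not in results:  # several questions may point at one chunk
            results.append(chunk)
    return results[:k]


if __name__ == "__main__":
    index = build_index(list(PREGENERATED))
    print(retrieve(index, "how often are webhooks retried"))
```

In this sketch no LLM runs at query time; the cost moves to the offline question-generation step, which is where the chunking and question quality mentioned in the comment would matter.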
aantix, 3 months ago
When aggregating data from multiple systems, how do you handle the case of only searching against data chunks that the user is authorized to view? And if those permissions change?
anonymousDan, 3 months ago
Is RAG any good for coding tasks?
mkesper, 3 months ago
Latency must be brutal here. This will not be possible for any chat application, I guess.
emil_sorensen, 3 months ago
Curious if anyone else has run similar experiments?