Evaluating modular RAG with reasoning models

62 points by emil_sorensen 3 months ago

7 comments

serjester 3 months ago
We tried something similar and found much better results with o1 pro than o3 mini. RAG seems to require a level of world knowledge that the mini models don't have.

This comes with significantly higher latency and cost. But for us, answer quality is a much higher priority.
SubiculumCode 3 months ago
I found the parts discussing the current limitations of LLMs' understanding of tools interesting. Despite its apparent reasoning abilities, the model didn't seem to have an intuitive understanding of when to use the specific search tools.

I wonder whether this would benefit from a fine-tuned LLM module for that specific step, or even from providing a set of examples in the prompt of when to use which tool?
EngineeringStuf 3 months ago
Am I correct in reading that the RAG pipeline runs in realtime in response to a user query?

If so, then I would suggest that you run it ahead of time and generate possible questions from the LLM based on the context of the current semantically split chunk.

That way you only need to compare the embeddings at query time and it will already be pre-sorted and ranked.

The trick, of course, is chunking it correctly and generating the right questions. But in both cases I would look to the LLM to do that.

Happy to recommend some tips on semantically splitting documents using the LLM with really low token usage if you're interested.
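
A minimal sketch of the precomputation idea in the comment above, under loud assumptions: `embed` is a toy bag-of-words stand-in for a real embedding model, `generate_questions` is a hypothetical stub that returns canned questions where an LLM call would go, and the chunks, questions, and function names are all invented for illustration. It is not the pipeline from the article, only one way the "generate questions offline, compare embeddings at query time" step might look.

```python
import math
from collections import Counter

def embed(text):
    # Toy stand-in for an embedding model: lowercase bag-of-words counts.
    return Counter(text.lower().replace("?", "").replace(".", "").split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def build_index(chunks, generate_questions):
    # Offline step: generate questions per chunk and embed them once.
    index = []
    for chunk_id, chunk in enumerate(chunks):
        for question in generate_questions(chunk):
            index.append((embed(question), chunk_id))
    return index

def retrieve(query, index, chunks, k=1):
    # Query-time step: only embed the query and compare against stored vectors.
    query_vec = embed(query)
    ranked = sorted(index, key=lambda entry: cosine(query_vec, entry[0]), reverse=True)
    seen, results = set(), []
    for _, chunk_id in ranked:
        if chunk_id not in seen:
            seen.add(chunk_id)
            results.append(chunks[chunk_id])
        if len(results) == k:
            break
    return results

# Hypothetical data: two chunks and the questions an LLM might generate for them.
chunks = [
    "Rotate API keys from the settings page under Security.",
    "Invoices are emailed on the first day of each month.",
]
canned_questions = {
    chunks[0]: ["How do I rotate my API key?", "Where can I change API keys?"],
    chunks[1]: ["When are invoices sent?", "How do I get my invoice?"],
}

def generate_questions(chunk):
    # Stub for the LLM call that would propose questions for a chunk.
    return canned_questions[chunk]

index = build_index(chunks, generate_questions)
top = retrieve("how can I rotate an api key", index, chunks)
```

With a real embedding model the index would typically live in a vector store, and the quality of the result depends on the chunking and the generated questions, as the comment notes.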
aantix 3 months ago
When aggregating data from multiple systems, how do you restrict search to the data chunks that the user is authorized to view? And what happens if those permissions change?
anonymousDan 3 months ago
Is RAG any good for coding tasks?
mkesper 3 months ago
Latency must be brutal here. This will not be possible for any chat application, I guess.
emil_sorensen 3 months ago
Curious if anyone else has run similar experiments?