TechEcho
© 2025 TechEcho. All rights reserved.

DeepRAG: Thinking to retrieval step by step for large language models

191 points by fofoz 4 months ago

3 comments

mkw5053 4 months ago
This reminds me of the Agent Workflow Memory (AWM) paper [1], which also tries to find optimal decision paths for LLM-based agents but relies on in-context learning, whereas DeepRAG fine-tunes models to decide when to retrieve external knowledge.

I've been thinking about how modifying AWM to use fine-tuning or an external knowledge system (RAG) might work, capturing the 'good' workflows it discovers rather than relying purely on prompting.

[1] https://arxiv.org/abs/2409.07429 - Agent Workflow Memory (Wang et al., 2024)
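The retrieve-or-not decision described above can be sketched in a few lines. Everything below is illustrative: the `decide_retrieve` heuristic and the toy corpus are hypothetical stand-ins, not DeepRAG's actual method, which fine-tunes the model itself to make this call at each reasoning step.

```python
# Sketch of a "decide when to retrieve" loop in the spirit of DeepRAG:
# at each step, first choose whether parametric knowledge suffices or
# external retrieval is needed. All names here are illustrative.

def decide_retrieve(question: str, known_facts: dict) -> bool:
    """Stub policy: retrieve only when the question mentions no entity
    we already have a stored fact for. In DeepRAG this decision is made
    by the fine-tuned model, not a rule like this."""
    return not any(entity in question for entity in known_facts)

def retrieve(question: str, corpus: dict) -> str:
    """Stub retriever: exact-match entity lookup over a toy corpus."""
    for entity, fact in corpus.items():
        if entity in question:
            return fact
    return ""

def answer(question: str, known_facts: dict, corpus: dict) -> str:
    """Answer from external retrieval only when the gate says so;
    otherwise fall back to 'parametric' knowledge (the stored facts)."""
    if decide_retrieve(question, known_facts):
        evidence = retrieve(question, corpus)
        return evidence or "unknown"
    for entity, fact in known_facts.items():
        if entity in question:
            return fact
    return "unknown"

known = {"Python": "Python was created by Guido van Rossum."}
corpus = {"DeepRAG": "DeepRAG fine-tunes models to decide when to retrieve."}

print(answer("Who created Python?", known, corpus))   # answered parametrically
print(answer("What does DeepRAG do?", known, corpus)) # triggers retrieval
```

The point of the gate is the trade-off the paper targets: skipping retrieval when the model already knows the answer saves latency and avoids injecting noisy passages, while still retrieving for unfamiliar entities.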
brunohaid 4 months ago
Noice!

Does anyone have a good recommendation for a local dev setup that does something similar with available tools? I.e., one that incorporates a bunch of PDFs (~10,000 pages of datasheets) and other docs, as well as a curl-style importer?

Trying to wean myself off the big tech molochs, ideally with local functionality similar to OpenAI's Search + Reason, and gave up on Langchain during my first attempt 6 months ago.
Comment #42938121 not loaded
Comment #42936683 not loaded
Comment #42935700 not loaded
Comment #42936759 not loaded
Comment #42942761 not loaded
jondwillis 4 months ago
The title reads awkwardly to a native English speaker. A search of the PDF for "latency" returns one result, discussing how naive RAG can result in latency. What are the latency impacts and other trade-offs to achieve the claimed "[improved] answer accuracy by 21.99%"? Is there any way that I could replicate these results without having to write my own implementation?