I am building another 'ask a PDF a question' RAG.

I have successfully converted the PDF to markdown.

Then I used the Jina segmenter to split it into chunks.

Each chunk is ~1000 characters long, but some are as short as just a section title.

I then stored all of these chunks in a vector database. To answer a question, I sort the chunks by cosine distance, take the first 100, and include them in the LLM prompt that's used to answer the user's question.

However...

I feel like I am missing a step.

The chunks returned by the query are mostly relevant, but they:

* do not include the full recipe
* include snippets of unrelated recipes

Is there a step I'm missing?
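For reference, the retrieval step is roughly this (a minimal sketch; embed() is a stand-in for the actual embedding model, and the real store is a vector database rather than an in-memory array):

    import numpy as np

    def embed(text):
        # placeholder for the real embedding model call (assumption)
        rng = np.random.default_rng(abs(hash(text)) % (2**32))
        return rng.standard_normal(384)

    def top_k_chunks(question, chunks, chunk_embeddings, k=100):
        # normalise so a dot product equals cosine similarity
        q = embed(question)
        q = q / np.linalg.norm(q)
        m = chunk_embeddings / np.linalg.norm(chunk_embeddings, axis=1, keepdims=True)
        sims = m @ q
        order = np.argsort(-sims)[:k]  # highest similarity first
        return [chunks[i] for i in order]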
RAG is so yesterday.

Upload the entire PDF directly to the API [1]: don't convert the PDF to markdown, don't vectorise. Put it in the API cache [2] and keep asking questions.

Chunking and vector search give mediocre results [3]. Same with full-text search. It's difficult to calibrate when the structure of the PDF is volatile.

[1] https://docs.anthropic.com/en/docs/build-with-claude/pdf-support

[2] https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching

[3] This works, but only for well-formatted PDFs where you chunk intelligently and extract reasonable metadata.
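Concretely, [1] + [2] together look roughly like this (a sketch against the Anthropic Python SDK; the model name and token limit are placeholders, check the linked docs for current values):

    import base64
    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    with open("recipes.pdf", "rb") as f:
        pdf_b64 = base64.standard_b64encode(f.read()).decode()

    def ask(question):
        # the document block carries the whole PDF; cache_control marks it
        # so repeat questions reuse the cached prefix instead of re-reading it
        return client.messages.create(
            model="claude-3-5-sonnet-latest",  # placeholder model name
            max_tokens=1024,
            messages=[{
                "role": "user",
                "content": [
                    {
                        "type": "document",
                        "source": {
                            "type": "base64",
                            "media_type": "application/pdf",
                            "data": pdf_b64,
                        },
                        "cache_control": {"type": "ephemeral"},
                    },
                    {"type": "text", "text": question},
                ],
            }],
        )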
If you load something into the LLM context, there's a non-zero chance it'll be referenced.

How are you chunking things? Can you chunk in a way that sidesteps the problem?

It's hard to give generic advice without knowing your PDF's structure.

But generally, you have two ways forward:

- Optimise chunking so it is more aware of the structure of the content being chunked
- Allow the LLM to refer to adjacent chunks via some kind of a pointer (sketched below)
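The second option can be as simple as storing each chunk with its neighbours' ids and expanding every hit before it goes into the prompt. A sketch, with the data layout entirely my assumption:

    def make_chunks(pieces):
        # keep prev/next ids alongside each chunk so hits can be expanded later
        return [
            {"id": i, "text": text,
             "prev": i - 1 if i > 0 else None,
             "next": i + 1 if i < len(pieces) - 1 else None}
            for i, text in enumerate(pieces)
        ]

    def expand(hit_ids, chunks, window=1):
        # follow the prev/next pointers `window` steps in each direction,
        # deduplicate, and return the chunks in document order
        keep = set()
        for cid in hit_ids:
            keep.add(cid)
            prev_id, next_id = chunks[cid]["prev"], chunks[cid]["next"]
            for _ in range(window):
                if prev_id is not None:
                    keep.add(prev_id)
                    prev_id = chunks[prev_id]["prev"]
                if next_id is not None:
                    keep.add(next_id)
                    next_id = chunks[next_id]["next"]
        return [chunks[i]["text"] for i in sorted(keep)]

For the recipe case, a window of 2-3 might be enough to pull the full recipe back in around whichever fragment matched.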