科技回声

6 comments

lovelearning 3 months ago

RAG and embedding-based search are the same thing AFAIK.

My approach is to stuff as many documents as possible directly into the context. The context windows of frontier models are large enough for my use case of ~20-40 documents. Context windows are at least 128K tokens for gpt-4o/o1/o3 and 1M for Gemini.

When stuffing all of them into one query isn't possible, split the documents across multiple queries and aggregate the answers.

I've tried RAG, but matching query embeddings to chunk embeddings isn't that straightforward. I noticed that relevant content was missed even with my modest number of documents. Semantic matching using query embeddings is one level above dumb keyword matching but one level below direct queries to LLMs.

Direct LLM queries seem to perform the best, especially when some intermediate understanding is required (like "Based on these documents, infer the industries where X technique may be useful"). That's not possible with simple embedding search unless some of the documents specifically use the umbrella word "industry" or its close synonyms.

Embedding search can probably be improved, for example by generating a synthetic answer and matching that answer's embedding to chunk embeddings. But I haven't tried such techniques.
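For anyone who wants to try the "split into multiple queries and aggregate" idea, here is a minimal sketch, not a tested production setup. Everything in it is an assumption for illustration: `ask_llm` is a hypothetical callable wrapping whatever chat API you use, the names `pack_documents` and `answer_over_documents` are made up, and the 4-characters-per-token estimate is a crude heuristic rather than a real tokenizer.

```python
from typing import Callable, List


def estimate_tokens(text: str) -> int:
    """Rough token estimate (~4 characters per token); a heuristic, not a tokenizer."""
    return max(1, len(text) // 4)


def pack_documents(docs: List[str], budget_tokens: int) -> List[List[str]]:
    """Greedily group documents so each group stays within the token budget."""
    batches: List[List[str]] = []
    current: List[str] = []
    used = 0
    for doc in docs:
        cost = estimate_tokens(doc)
        # Start a new batch when adding this document would exceed the budget.
        if current and used + cost > budget_tokens:
            batches.append(current)
            current, used = [], 0
        # A document larger than the budget still gets a batch of its own.
        current.append(doc)
        used += cost
    if current:
        batches.append(current)
    return batches


def answer_over_documents(
    question: str,
    docs: List[str],
    ask_llm: Callable[[str], str],  # hypothetical: prompt in, answer text out
    budget_tokens: int,
) -> str:
    """Ask the question once per batch, then ask again to merge the partial answers."""
    batches = pack_documents(docs, budget_tokens)
    partials = [
        ask_llm(question + "\n\nDocuments:\n" + "\n---\n".join(batch))
        for batch in batches
    ]
    if not partials:
        return ""
    if len(partials) == 1:
        return partials[0]
    merge_prompt = (
        "Combine these partial answers into one answer to: "
        + question
        + "\n\n"
        + "\n---\n".join(partials)
    )
    return ask_llm(merge_prompt)
```

The budget should leave headroom below the model's context window for the question and the reply. The final merge call is itself a direct LLM query, so it can still do the "intermediate understanding" step across batches.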

epirogov 3 months ago

Hello, I found that Aspose released an LLM plugin: https://products.aspose.net/pdf/chat-gpt/

At a glance, I see it supports some advanced features: automatic detection of multiple languages, and batching of requests to reduce LLM API call frequency and lower operational costs.

constantinum 3 months ago

There is one with LangChain + Pydantic + LLMWhisperer: https://unstract.com/blog/comparing-approaches-for-using-llms-for-structured-data-extraction-from-pdfs/

muzani 3 months ago

LangChain was the OG for PDF RAG. You don't need fine-tuning or anything; it does embedding-based search right out of the box.
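To make "embedding-based search" concrete, here is a library-free sketch of the ranking step: embed the query, embed each chunk, and sort by cosine similarity. This is not LangChain's API. The names `toy_embed`, `cosine`, and `top_k_chunks` are hypothetical, and `toy_embed` is only a word-count placeholder standing in for a real embedding model, so it behaves like keyword matching.

```python
import math
import re
from collections import Counter
from typing import Callable, Dict, List, Tuple

Vector = Dict[str, float]


def toy_embed(text: str) -> Vector:
    """Placeholder embedding: lowercase word counts. A real model returns dense vectors."""
    return dict(Counter(re.findall(r"[a-z0-9]+", text.lower())))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(
    query: str,
    chunks: List[str],
    k: int = 3,
    embed: Callable[[str], Vector] = toy_embed,  # swap in a real model here
) -> List[Tuple[float, str]]:
    """Return the k chunks most similar to the query as (score, chunk) pairs."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    chunks = [
        "The pump controller uses PID tuning.",
        "Invoices are archived after ninety days.",
        "PID tuning reduces overshoot in the pump.",
    ]
    for score, chunk in top_k_chunks("PID tuning pump", chunks, k=2):
        print(round(score, 3), chunk)
```

The retrieved chunks are then pasted into the LLM prompt. Anything the ranking step misses never reaches the model, which is the failure mode described in the first comment of this thread.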

ratg13 3 months ago

Microsoft Copilot does this out of the box.

Just upload your documents to a OneDrive, SharePoint, or Teams site that you have access to and start asking questions.

mayoosh 3 months ago

I think you can just give the full PDF to Gemini 2.0 Flash using their labs UI and then chat with it.

Ask HN: Best LLM Stack for Q&A over Internal PDFs?