科技回声 (Tech Echo)

A tech news platform built with Next.js, offering global tech news and discussion.


© 2025 科技回声 (Tech Echo). All rights reserved.

Ask HN: Best LLM Stack for Q&A over Internal PDFs?

11 points | by samrohn | 3 months ago
I'm looking to build an LLM-based chatbot that can answer questions using a set of internal PDF documents. Has anyone worked on a similar use case with good success? What approach and LLM stack did you use to solve this - RAG (Retrieval-Augmented Generation), fine-tuning, or embedding-based search?
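For reference, the retrieval step at the core of the RAG / embedding-based search approaches mentioned in the question can be sketched as follows. This is a minimal, hypothetical illustration: the bag-of-words `embed()` stands in for a real embedding model, and the `retrieve()` helper and sample chunks are invented for the example. In a real pipeline the retrieved chunks would be prepended to the LLM prompt.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": lowercase word counts. A real system would call an
    # embedding model here and get back a dense vector instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank document chunks by similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

# Invented sample chunks standing in for extracted PDF text.
chunks = [
    "Invoices must be approved by the finance team within 30 days.",
    "The onboarding checklist covers laptop setup and security training.",
    "Travel expenses require receipts for any amount over 50 dollars.",
]
top = retrieve("invoices approved by finance", chunks, k=1)
```

The same shape applies with a real embedding model: embed the chunks once, store the vectors, and at query time embed the question and rank chunks by cosine similarity.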

6 comments

lovelearning, 3 months ago
RAG and embedding-based search are the same thing AFAIK.

My approach is to stuff as many documents as possible directly into the context. The context windows of frontier models are large enough for my use case of ~20-40 documents. Context windows are 128K tokens for gpt-4o/o1/o3 and 1M for Gemini.

When stuffing all of them into one query isn't possible, split the documents into multiple queries and aggregate the answers.

I've tried RAG, but matching query embeddings to chunk embeddings isn't that straightforward. I noticed that relevant content was missed even with my modest number of documents. Semantic matching using query embeddings is one level above dumb keyword matching but one level below direct queries to LLMs.

Direct LLM queries seem to perform best, especially when some intermediate understanding is required (like "Based on these documents, infer the industries where X technique may be useful"). That's not possible with simple embedding search unless some of the documents specifically use the umbrella word "industry" or its close synonyms.

Embedding search can probably be improved - like generating a synthetic answer and matching that answer's embedding to chunk embeddings. But I haven't tried such techniques.
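The split-and-aggregate approach described above can be sketched roughly as follows. This is a hypothetical outline, not a working client: `ask_llm` is a stand-in for whatever chat-completion call you use, and the 4-characters-per-token estimate is a crude heuristic, not a real tokenizer.

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic (~4 characters per token); use a real tokenizer
    # such as tiktoken for accurate budgeting.
    return len(text) // 4

def pack_batches(docs: list[str], budget_tokens: int) -> list[list[str]]:
    # Greedily pack documents into batches that each fit the token budget.
    batches, current, used = [], [], 0
    for doc in docs:
        cost = estimate_tokens(doc)
        if current and used + cost > budget_tokens:
            batches.append(current)
            current, used = [], 0
        current.append(doc)
        used += cost
    if current:
        batches.append(current)
    return batches

def answer_over_docs(question, docs, ask_llm, budget_tokens=100_000):
    # Query each batch separately, then merge the partial answers
    # with a final aggregation call.
    partials = []
    for batch in pack_batches(docs, budget_tokens):
        context = "\n\n".join(batch)
        partials.append(ask_llm(f"{context}\n\nQuestion: {question}"))
    if len(partials) == 1:
        return partials[0]
    merged = "\n".join(partials)
    return ask_llm(f"Combine these partial answers:\n{merged}")
```

When everything fits in one batch this degenerates to plain context stuffing; the aggregation call only happens when the corpus exceeds the window.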
epirogov, 3 months ago
Hello, I found that Aspose released an LLM plugin:

https://products.aspose.net/pdf/chat-gpt/

At a glance, I see it supports some advanced features: automatic detection of multiple languages, and batching requests to reduce LLM API call frequency and lower operational costs.
constantinum, 3 months ago
There is one with LangChain + pydantic + LLMWhisperer: https://unstract.com/blog/comparing-approaches-for-using-llms-for-structured-data-extraction-from-pdfs/
muzani, 3 months ago
LangChain was the OG for PDF RAG. You don't need fine-tuning or anything; it does embedding-based search right out of the box.
ratg13, 3 months ago
Microsoft Copilot does this out of the box.

Just upload your documents to a OneDrive, SharePoint, or Teams site that you have access to and start asking questions.
mayoosh, 3 months ago
I think you can just give the full PDF to Gemini 2.0 Flash using their labs UI and then chat with it.