
Ask HN: What's the state of LLM knowledge retrieval in Jan 2024?

20 points by chenxi9649, over 1 year ago
Say I have a huge corpus of unstructured data. What's the most effective way to get a model that can produce great answers on that data?

Is RAG still the way to go? Should one also fine-tune the model on that data?

It seems that getting RAG to work well requires a lot of optimization. Are there drag-and-drop solutions that work well? I know the OpenAI Assistants API has built-in knowledge retrieval; does anyone have experience with how it compares to other methods?

Or is it better to pre-train a custom model and instruction-tune it?

Would love to know what you're all doing!

2 comments

cl42, over 1 year ago
Fine-tuning is not a good approach for integrating new knowledge into an LLM. It's a good way to steer the LLM's style and the structure of its responses (e.g., length, format).

I'd say RAG is still very much the way to go. What you then need to do is optimize how you chunk and embed data into the RAG database. Pinecone has a good post on this [1], and I believe others [2] are working on more automated solutions.

If you want a more generalized idea here, what state-of-the-art (SOTA) models seem to be doing is using a more general "second brain" for LLMs to obtain information. This can take the form of RAG, as above, or of more complex and rigorous models. For example, AlphaGeometry [3] uses an LLM combined with a geometry theorem prover to find solutions to problems.

[1] https://www.pinecone.io/learn/chunking-strategies/

[2] https://unstructured.io/

[3] https://deepmind.google/discover/blog/alphageometry-an-olympiad-level-ai-system-for-geometry/
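To make the chunk-and-embed step concrete, here is a minimal sketch of the kind of pipeline cl42 is describing. It assumes the sentence-transformers and numpy packages; the model name, chunk size, overlap, and cosine-similarity retrieval are illustrative assumptions, not a recommended configuration.

```python
# Minimal chunk -> embed -> retrieve sketch (assumptions: sentence-transformers,
# numpy; "all-MiniLM-L6-v2" and the chunking parameters are placeholder choices).
import numpy as np
from sentence_transformers import SentenceTransformer

def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into overlapping fixed-size character windows."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

model = SentenceTransformer("all-MiniLM-L6-v2")

corpus = ["...your unstructured documents go here..."]
chunks = [c for doc in corpus for c in chunk_text(doc)]
chunk_vectors = model.encode(chunks, normalize_embeddings=True)  # shape: (n_chunks, dim)

def retrieve(query: str, k: int = 5) -> list[str]:
    """Return the k chunks most similar to the query by cosine similarity."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = chunk_vectors @ q  # cosine similarity, since vectors are normalized
    top = np.argsort(scores)[::-1][:k]
    return [chunks[i] for i in top]
```

In a real system the in-memory arrays would be replaced by a vector database, and the chunking strategy (fixed windows vs. semantic or document-aware splits) is exactly the knob the Pinecone post above discusses.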
mfalcon, over 1 year ago
I think there is a lot of ground to cover in "RAG". Most of the demos and tutorials online seem to simply use a vector database to retrieve similar documents by cosine distance.

I'm now working on a "hybrid" search that combines lexical and semantic search, using an LLM to translate a user message into a search query for retrieving data.

As far as I know there's no "standard"; the field keeps moving and there are no simple answers.
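A minimal sketch of the hybrid lexical-plus-semantic retrieval mfalcon describes, assuming the rank-bm25 and sentence-transformers packages. The fusion weight, the min-max score blending, and the rewrite_query stub (standing in for the LLM query-rewriting step) are illustrative assumptions rather than a standard recipe.

```python
# Hybrid retrieval sketch: BM25 (lexical) blended with embeddings (semantic).
# Assumptions: rank-bm25 and sentence-transformers packages; alpha and the
# normalization/fusion scheme are placeholder choices.
import numpy as np
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer

documents = ["...corpus documents go here..."]
model = SentenceTransformer("all-MiniLM-L6-v2")

bm25 = BM25Okapi([doc.lower().split() for doc in documents])       # lexical index
doc_vectors = model.encode(documents, normalize_embeddings=True)   # semantic index

def rewrite_query(user_message: str) -> str:
    """Placeholder for the LLM step that turns a chat message into a search query."""
    return user_message  # in practice, call an LLM here

def hybrid_search(user_message: str, k: int = 5, alpha: float = 0.5) -> list[str]:
    query = rewrite_query(user_message)
    lexical = bm25.get_scores(query.lower().split())
    semantic = doc_vectors @ model.encode([query], normalize_embeddings=True)[0]

    def norm(x):
        # Min-max normalize so the two score ranges are comparable before blending.
        x = np.asarray(x, dtype=float)
        return (x - x.min()) / (x.max() - x.min() + 1e-9)

    combined = alpha * norm(lexical) + (1 - alpha) * norm(semantic)
    top = np.argsort(combined)[::-1][:k]
    return [documents[i] for i in top]
```

Other fusion schemes (e.g., reciprocal rank fusion) are common alternatives to the weighted-sum blend shown here.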