TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.


Ask HN: What's the state of LLM knowledge retrieval in Jan 2024?

20 points, by chenxi9649, over 1 year ago
Say I have a huge corpus of unstructured data. What's the most effective way to get a model that can produce great answers on that data?

Is RAG still the way to go? Should one also fine-tune the model on that data?

It seems that getting RAG to work well requires a lot of optimization. Are there drag-and-drop solutions that work well? I know the OpenAI Assistants API has built-in knowledge retrieval; does anyone have experience with how good that is compared to other methods?

Or is it better to pre-train a custom model and instruction-tune it?

Would love to know what you are all doing!

2 comments

cl42, over 1 year ago
Fine-tuning is not a good approach for integrating new knowledge into an LLM. It's a good way to steer the LLM's style and the structure of its responses (e.g., length, format).

I'd say RAG is still very much the way to go. What you then need to do is optimize how you chunk and embed data into the RAG database. Pinecone has a good post on this[1], and I believe others[2] are working on more automated solutions.

For a more generalized view: what state-of-the-art (SOTA) models seem to be doing is using a more general "second brain" from which the LLM obtains information. This can take the form of RAG, as above, or of more complex and rigorous models. For example, AlphaGeometry[3] combines an LLM with a geometry theorem prover to find solutions to problems.

[1] https://www.pinecone.io/learn/chunking-strategies/

[2] https://unstructured.io/

[3] https://deepmind.google/discover/blog/alphageometry-an-olympiad-level-ai-system-for-geometry/
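[Editor's note] The chunk/embed/retrieve loop described in the comment above can be sketched as follows. This is a toy illustration, not any particular library's API: the `embed()` here is a hashed bag-of-words placeholder for a real embedding model, and the chunk sizes are arbitrary.

```python
# Minimal RAG retrieval sketch: chunk documents, embed the chunks, and
# retrieve the chunks nearest to a query by cosine similarity.
# embed() is a toy hashed bag-of-words vector; a real pipeline would call
# an embedding model (an API or a local sentence-embedding model) instead.
import math
from collections import Counter

def chunk(text, size=50, overlap=10):
    """Split text into overlapping word-window chunks."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]

def embed(text, dim=256):
    """Toy embedding: L2-normalized hashed bag-of-words vector."""
    vec = [0.0] * dim
    for word, count in Counter(text.lower().split()).items():
        vec[hash(word) % dim] += count
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query (cosine similarity)."""
    qv = embed(query)
    scored = [(sum(a * b for a, b in zip(qv, embed(c))), c) for c in chunks]
    return [c for _, c in sorted(scored, key=lambda t: t[0], reverse=True)[:k]]
```

The optimization the comment mentions lives mostly in `chunk()` (size, overlap, splitting on semantic boundaries rather than word counts) and in the choice of embedding model.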
mfalcon, over 1 year ago
I think there is a lot of ground to cover in "RAG". Most of the demos and tutorials online seem to simply use a vector database to retrieve similar documents by cosine distance.

I'm now working on a "hybrid" search that combines lexical and semantic search, using an LLM to translate a user message into a search query before retrieving data.

As far as I know there's no "standard": the field keeps moving, and there are no simple answers.
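[Editor's note] A minimal sketch of the hybrid scoring idea in the comment above. Assumptions: the lexical score here is plain query-term overlap standing in for BM25, the semantic score reuses a toy hashed bag-of-words embedding, and the blend weight `alpha` is an invented tuning knob, not from any specific system.

```python
# Hybrid search sketch: rank documents by a weighted blend of a lexical
# score (query-term overlap, a stand-in for BM25) and a semantic score
# (cosine similarity over toy hashed bag-of-words vectors).
import math
from collections import Counter

def lexical_score(query, doc):
    """Fraction of distinct query terms that appear in the document."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0

def semantic_score(query, doc, dim=256):
    """Cosine similarity of toy hashed bag-of-words embeddings."""
    def embed(text):
        vec = [0.0] * dim
        for word, count in Counter(text.lower().split()).items():
            vec[hash(word) % dim] += count
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        return [x / norm for x in vec]
    qv, dv = embed(query), embed(doc)
    return sum(a * b for a, b in zip(qv, dv))

def hybrid_search(query, docs, alpha=0.5, k=3):
    """Rank docs by alpha * lexical + (1 - alpha) * semantic."""
    scored = [(alpha * lexical_score(query, d)
               + (1 - alpha) * semantic_score(query, d), d) for d in docs]
    return [d for _, d in sorted(scored, key=lambda t: t[0], reverse=True)[:k]]
```

The lexical term catches exact keyword matches the embedding might miss (IDs, rare names); the semantic term catches paraphrases; `alpha` trades one off against the other.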