
10 comments

analyte123 · about 1 year ago
The tutorials, examples, demos, and blogs for every vector DB or RAG system that I know of focus mostly on toy datasets far, far under a million tokens. So most people can be forgiven for the view that long context would kill RAG - and it really should for these trivial use cases.

I don't know if vector DB and RAG vendors don't demo their software with legitimately large datasets because they don't want to compete with their customers, or because they're not confident enough in the results, or because they don't really use their software themselves.

To give an example of a RAG dataset I have played with: it is about 10k documents and 5M tokens. Adding prior versions, expanding the coverage, or augmenting from other sources, I'm sure it could blow well past 10M tokens. Maybe extremely long context models will get extremely fast and cheap, but at least for the next couple of years you probably won't be stuffing all of this into the context. Even with a 1M-token context you still have to filter, retrieve, and rank the documents. And 10,000 documents is really not a lot compared to other corpora you could imagine (e-mails, media, law, science, code, etc.). But it is certainly bigger than many of the use cases being pitched for RAG - like a few hundred personal notes or a corporate wiki with 500 poorly maintained pages on it.
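To make the filter/retrieve/rank step concrete, here is a minimal sketch of what that narrowing looks like. The `embed()` function below is a placeholder standing in for a real embedding model, and the corpus is a toy in-memory list:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Placeholder: a real pipeline would call an actual embedding model here.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(64)
    return v / np.linalg.norm(v)

# Toy stand-in for a ~10k-document corpus.
documents = [f"document {i} body text" for i in range(10_000)]
doc_vectors = np.stack([embed(d) for d in documents])

def retrieve(query: str, k: int = 20) -> list[str]:
    # Cosine similarity (vectors are unit-normalized); keep the top-k documents.
    scores = doc_vectors @ embed(query)
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]

# Only the top-k snippets, not the whole multi-million-token corpus,
# ever make it into the prompt.
context = "\n\n".join(retrieve("example query"))
```

Whatever the window size, some version of this top-k step is what keeps a 10M-token corpus out of the prompt.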
cthalupa · about 1 year ago
I have no idea if massive context will kill RAG, but an article written by someone who sells vector databases for a living is hardly a conflict-of-interest-free source.
Havoc · about 1 year ago
I don’t understand why it is phrased as either-or at all. The two techniques come with very different trade-offs and can each be harnessed where appropriate.
xrd · about 1 year ago
Are there open models that support these extremely long context lengths? For RAG-like functionality, this seems much easier than retraining the model, even if the quality isn't always perfect. For personal use I'm very interested in this application.
gtr32x · about 1 year ago
Using a massive context window is akin to onboarding a new employee before every mundane task, whereas a trained employee takes on new tasks easily with existing context.

The trade-off is simply cost. In the LLM setting, that cost shows up as speed of execution and token cost.
refulgentis · about 1 year ago
Fitting things into 4096 tokens is an advantage of RAG, but quality of answers is night & day when you have actual sources. It also commodifies web indexing, which is interesting.
_boffin_ · about 1 year ago
I’ve said it once and I’ll say it again: cost per token is still part of the equation. Why throw money away by stuffing the context with unneeded tokens?
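As a back-of-the-envelope illustration of that point (the price below is an assumed placeholder, not a real quote from any provider):

```python
# Assumed illustrative price: $10 per 1M input tokens (placeholder, not real pricing).
price_per_token = 10 / 1_000_000

stuffed = 1_000_000 * price_per_token  # ~1M tokens of raw corpus per query
rag     =     4_000 * price_per_token  # ~4k tokens of retrieved snippets per query

print(f"per-query input cost: stuffed=${stuffed:.2f}, rag=${rag:.3f}")
# per-query input cost: stuffed=$10.00, rag=$0.040
```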
joshellington · about 1 year ago
Clickbait marketing blog post. Doesn’t belong on FP IMO.
superchink · about 1 year ago
No. (article supports this too)
wokwokwok · about 1 year ago
> It is a proven solution that effectively addresses fundamental LLM challenges such as hallucinations and lacking domain-specific knowledge.

mm.

> While RAG has proven beneficial in reducing LLM hallucinations, it does have limitations.

mhm.

> [good models] support 32k-token-long contexts, showcasing a substantial improvement in embedding capabilities. This enhancement in embedding unstructured data also elevates RAG's understanding of long contexts.

mmmmhmmm.

So, let me get this straight.

A model with a long context makes RAG significantly more effective because you can put more context into the input.

...but a model with a *really massive context window* won't be significantly better?

??

> Vector databases, one of the cutting-edge AI technologies, are a core component in the RAG pipeline. Opting for a more mature and advanced vector database, such as Milvus,

> Conclusion: RAG Remains a Linchpin for the Sustained Success of AI Applications.

Aha! So this is a sales pitch. Right.

No. You're wrong.

RAG, as it currently exists, is a dead-end technology. It exists because models don't have a large enough context window.

If models get a significantly larger context window, it will become mostly irrelevant.

Obviously, there are always going to be some technical / compute limitations on the window size, and at some level you'll always need to filter down to the relevant context to put into the input, so yes, technically the approach will always be around *in some form*.

However, RAG in its *current form*, where you have a tiny context window and you put little vector-db-located snippets into it... well, let's just say that if I were a vendor for a vector database product, I'd also be worried and also be producing opinion pieces like this.

An open model with a massive context would solve these problems trivially for most people and make most 'vector db' products unnecessary for most uses.
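For readers who haven't seen it, the "current form" being critiqued boils down to trimming ranked snippets to a small token budget before they ever reach the model. A minimal sketch, assuming the snippets arrive already ranked best-first and using a crude 4-characters-per-token estimate:

```python
def fit_to_budget(snippets: list[str], budget_tokens: int = 4096) -> list[str]:
    # Crude estimate of ~4 characters per token; a real pipeline would use
    # the model's tokenizer. Snippets are assumed already ranked best-first.
    chosen, used = [], 0
    for s in snippets:
        cost = len(s) // 4 + 1
        if used + cost > budget_tokens:
            break  # budget exhausted; everything else is dropped
        chosen.append(s)
        used += cost
    return chosen
```

A larger window simply raises `budget_tokens` and admits more snippets; as the comment concedes, some version of this filtering survives at any finite window size.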