analyte123 · about 1 year ago
The tutorials, examples, demos, and blogs for every vector DB or RAG system that I know of focus mostly on toy datasets far, far under a million tokens. So most people can be forgiven for the view that long context would kill RAG - and it really should for these trivial use cases.

I don't know if vector DB and RAG vendors don't demo their software with legitimately large datasets because they don't want to compete with their customers, or because they're not confident enough in the results, or because they don't really use their software themselves.

To give an example: a RAG dataset I have played with is about 10k documents and 5M tokens. Adding prior versions, expanding the coverage, or augmenting from other sources, I'm sure it could blow well past 10M tokens. Maybe extremely long context models will get extremely fast and cheap, but at least for the next couple of years you probably won't be stuffing this all in the context. Even with a 1M context you still have to filter, retrieve, and rank the documents. And 10,000 documents is really not a lot compared to other corpora you could imagine (e-mails, media, law, science, code, etc.). But it is certainly bigger than many of the use cases being pitched for RAG - like a few hundred personal notes or a corporate wiki with 500 poorly maintained pages on it.
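A minimal sketch of the filter/retrieve/rank step this comment describes, assuming an in-memory corpus and cosine similarity; the embed function here is a stand-in for a real embedding model, and every name and number is illustrative:

    import numpy as np

    def embed(texts):
        # Placeholder: in practice this would call an embedding model.
        # Returns one unit-length vector per input text.
        rng = np.random.default_rng(0)
        vecs = rng.normal(size=(len(texts), 384))
        return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

    def retrieve(query, docs, doc_vecs, k=20, token_budget=100_000):
        # Rank every document by cosine similarity to the query...
        q = embed([query])[0]
        order = np.argsort(-(doc_vecs @ q))
        # ...then keep only what fits the context budget: even a 1M-token
        # window still forces you to filter and rank first.
        picked, used = [], 0
        for i in order[:k]:
            cost = len(docs[i]) // 4  # rough chars-to-tokens estimate
            if used + cost > token_budget:
                break
            picked.append(docs[i])
            used += cost
        return picked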
cthalupa · about 1 year ago
I have no idea if massive context will kill RAG, but an article written by someone who sells vector databases for a living is not the most conflict-of-interest-free source.
Havoc · about 1 year ago
I don’t understand why it is phrased as either/or at all. The two techniques seem to come with very different trade-offs, which can be harnessed appropriately.
xrd · about 1 year ago
Are there open models that support these extremely long context lengths? It seems that for RAG-like functionality, this is much easier than retraining the model, even if the quality isn't always perfect. For personal usage, I'm very interested in this application.
gtr32x · about 1 year ago
Using a massive context window is akin to onboarding a new employee before every mundane task, while a trained employee takes on new tasks easily with existing context.

The trade-off is simply cost. In the LLM scene, that cost shows up as speed of execution and token cost.
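Back-of-the-envelope arithmetic for that trade-off; the price and volume figures below are hypothetical placeholders, not any vendor's actual rates:

    PRICE_PER_1K_INPUT_TOKENS = 0.01  # hypothetical rate in dollars

    def daily_cost(context_tokens, queries_per_day):
        # Input-token cost scales linearly with what you stuff in the prompt.
        return context_tokens / 1000 * PRICE_PER_1K_INPUT_TOKENS * queries_per_day

    full_context = daily_cost(1_000_000, queries_per_day=1_000)  # $10,000/day
    rag_context = daily_cost(4_000, queries_per_day=1_000)       # $40/day
    print(f"full context: ${full_context:,.0f}/day vs RAG: ${rag_context:,.0f}/day")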
refulgentis · about 1 year ago
Fitting things into 4096 tokens is an advantage of RAG, but the quality of answers is night & day when you have actual sources. It also commodifies web indexing, which is interesting.
_boffin_ · about 1 year ago
I’ve said it once and I’ll say it again: cost per token is still part of the equation. Why throw money away by stuffing the context with unneeded tokens?
joshellington · about 1 year ago
Clickbait marketing blog post. Doesn’t belong on FP IMO.
wokwokwok · about 1 year ago
> It is a proven solution that effectively addresses fundamental LLM challenges such as hallucinations and lacking domain-specific knowledge.

mm.

> While RAG has proven beneficial in reducing LLM hallucinations, it does have limitations.

mhm.

> [good models] support 32k-token-long contexts, showcasing a substantial improvement in embedding capabilities. This enhancement in embedding unstructured data also elevates RAG’s understanding of long contexts.

mmmmhmmm.

So, let me get this straight.

A model with a long context makes RAG significantly more effective because you can put more context into the input.

...but a model with a *really massive context window* won't be significantly better?

??

> Vector databases, one of the cutting-edge AI technologies, are a core component in the RAG pipeline. Opting for a more mature and advanced vector database, such as Milvus,

> Conclusion: RAG Remains a Linchpin for the Sustained Success of AI Applications.

Aha! So, this is a sales pitch. Right.

No. You're wrong.

RAG, as it currently exists, is a dead-end technology. It exists because models don't have a large enough context window.

If models get a significantly larger context window, it will become mostly irrelevant.

Obviously, there are always going to be some technical/compute limitations on the window size, and at some level, you'll always need to filter down to the relevant context to put into the input, so yes, technically the approach will always be around *in some form*.

However, RAG in its *current form*, where you have a tiny context window and you put little vector-db-located snippets in it, well... let's just say, if I were a vendor for a vector database product, I'd also be worried and also be producing opinion pieces like this.

An open model with a massive context would solve these problems trivially for most people, and make most 'vector db' products unnecessary for most uses.
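For what it's worth, the "always around in some form" point can be made concrete: the prompt-assembly step looks the same at any window size, and a bigger window only changes how much the filter keeps. A minimal sketch, with illustrative names and numbers:

    def build_prompt(question, ranked_snippets, window_tokens):
        # The assembly step is identical at any window size; a larger
        # window just means the filter keeps more, not that it disappears.
        budget = window_tokens - 500  # reserve headroom for the answer
        picked, used = [], 0
        for snippet in ranked_snippets:  # assumed already ranked by relevance
            cost = len(snippet) // 4  # rough chars-to-tokens estimate
            if used + cost > budget:
                break
            picked.append(snippet)
            used += cost
        return "\n\n".join(picked) + "\n\nQuestion: " + question

    # With a 4K window this keeps a handful of vector-db snippets;
    # with a 1M window the same code simply keeps far more of them.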