If you're interested in the SQL component of this, we're building a product strictly focused on that at <a href="https://www.definite.app/" rel="nofollow noreferrer">https://www.definite.app/</a>. We let non-technical users ask questions of their SQL database. We do this by:<p>1. Pulling in your schema information and structuring it so that LLMs can reason about it<p>2. Pulling in your prior query history against the database to understand how you actually use your data (e.g. which JOINs are common, which tables are used most frequently, etc.)<p>3. Adding context from other tools you may be using (e.g. we can pull in metadata and tests from your dbt project)<p>We also have a Slackbot you can add to your #urgent-data-requests channel. If you @Definite in a thread, it'll parse out messages that can be converted to SQL tasks and return the answer from your database.<p>You could certainly build this yourself with (or without) LlamaIndex, but it's still quite a bit of work to set up.
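<p>To give a feel for steps 1 and 2 above, here is a minimal sketch of turning a schema and a query log into plain-text context for an LLM prompt. It is not Definite's implementation: it assumes a SQLite database, and the function names (`describe_schema`, `summarize_history`, `build_context`) and the regex-based table extraction are illustrative choices only.

```python
import re
import sqlite3
from collections import Counter
from itertools import combinations

# Crude assumption: any identifier right after FROM or JOIN is a table name.
# This will misfire on things like EXTRACT(year FROM col) or subqueries.
TABLE_RE = re.compile(r"\b(?:FROM|JOIN)\s+(\w+)", re.IGNORECASE)


def describe_schema(conn):
    """Return one 'table(col type, ...)' line per table in a SQLite database."""
    lines = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # Each PRAGMA row is (cid, name, type, notnull, dflt_value, pk).
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        col_text = ", ".join(f"{c[1]} {c[2]}" for c in cols)
        lines.append(f"{table}({col_text})")
    return lines


def summarize_history(queries):
    """Count how often each table is used and which tables appear together."""
    table_counts = Counter()
    pair_counts = Counter()
    for query in queries:
        tables = [t.lower() for t in TABLE_RE.findall(query)]
        table_counts.update(tables)
        # Tables that co-occur in one query are treated as "joined together".
        for pair in combinations(sorted(set(tables)), 2):
            pair_counts[pair] += 1
    return table_counts, pair_counts


def build_context(conn, queries, top_n=5):
    """Assemble schema and usage statistics into a text block for a prompt."""
    table_counts, pair_counts = summarize_history(queries)
    parts = ["Schema:"] + describe_schema(conn)
    parts.append("Most used tables:")
    parts += [f"{name}: {n}" for name, n in table_counts.most_common(top_n)]
    parts.append("Tables commonly queried together:")
    parts += [f"{a} + {b}: {n}" for (a, b), n in pair_counts.most_common(top_n)]
    return "\n".join(parts)
```

The resulting string would be prepended to the user's question before it goes to the model. A real system would want a proper SQL parser rather than a regex, plus the dbt metadata mentioned in step 3.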
Hey all! Jerry here (from LlamaIndex).<p>We love the feedback, and one recurring theme is making the docs better:
- Improve the organization to better expose both our basic and our advanced capabilities
- Improve the documentation around customization (from LLMs to retrievers, etc.)
- Improve the clarity of our examples/notebooks.<p>Will have an update in a day or two :)
There once existed Google Desktop, which was really useful.<p>Is this something similar, but with the added feature of being able to query the data with the help of an LLM?<p>For example: "Find me all the text files I modified last month; one of them should contain a log snippet with a TODO I added."
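<p>For reference, the non-LLM half of that example query (filter by modification time, then by content) is easy to sketch with the standard library. This is only an illustration of the kind of structured search an LLM would need to translate the request into; the function name `find_recent_files` and its parameters are made up for this example, and "last month" is approximated as the last 30 days.

```python
import time
from pathlib import Path


def find_recent_files(root, needle, days=30, suffix=".txt", now=None):
    """Return files under root modified within `days` that contain `needle`."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400  # seconds in a day
    matches = []
    for path in Path(root).rglob(f"*{suffix}"):
        if not path.is_file():
            continue
        # Skip files last modified before the cutoff.
        if path.stat().st_mtime < cutoff:
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file, ignore it
        if needle in text:
            matches.append(path)
    return sorted(matches)
```

Something like `find_recent_files(Path.home(), "TODO")` would cover the literal request. The interesting part for an LLM-backed tool is mapping the fuzzy phrasing onto those parameters.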
I gave this a shot a while back and found plenty of examples but little documentation.<p>For instance, there is a tree structure for storing the embeddings, and the library is able to construct it with a single line. However, I couldn’t find a clear explanation of how that tree is constructed or how to take advantage of it.
If you’re doing legitimate retrieval and reranking in a commercial enterprise setting, then I doubt this is a library that can support you beyond prototyping.<p>Retrieval involves complex integration (not just data connectors and open API wrappers), and meaningful reranking requires domain- and context-specific trained models (that you can deploy performantly and cost-effectively). If you’re doing these things, you need platform-scale capabilities well beyond what a Python library provides.