I’m curious to try it out. There seem to be many options to upload a document and ask stuff about it.<p>But, the holy grail is an LLM that can successfully work on a large corpus of documents and data like slack history, huge wiki installations and answer useful questions with proper references.<p>I tried a few, but they don’t really hit the mark. We need the usability of a simple search engine UI with private data sources.
Looks promising, especially if you can select just your docs and avoid interacting with Mistral.
I’ll give it a try to see how it performs. So far I’ve had mixed results with other similar solutions.<p><a href="https://news.ycombinator.com/item?id=39925316">https://news.ycombinator.com/item?id=39925316</a><p><a href="https://news.ycombinator.com/item?id=39896923">https://news.ycombinator.com/item?id=39896923</a>
I have collected so much information in text files on my computer that it has become unmanageable to find anything. Now, with local AI solutions, I wonder if I could build a smart search engine that answers questions about my personal data.<p>My questions are:<p>1 - Even though there is so much data that I can no longer find things, how much text data is needed for an LLM to work okay? I'm not after an AI that can answer general questions, only one that can answer from what I already know exists in the data.<p>2 - I understand that the more structured the data is, the better, but how important is structure when training an LLM? Does it mostly just figure things out anyway?<p>3 - Any recommendations on where to start, and how to run an LLM locally and train it on your own data?
Thanks for sharing! I look forward to playing with this once I get off my phone. Took a look at the code, though, to see if you've implemented any of the tricks I've been too lazy to try.<p>`text_splitter=RecursiveCharacterTextSplitter( chunk_size=8000, chunk_overlap=4000)`<p>Does this simple numeric chunking approach actually work? Or are more sophisticated splitting rules going to make a difference?<p>`vector_store_ppt=FAISS.from_documents(text_chunks_ppt, embeddings)`<p>So we're embedding all 8000 chars behind a single vector index. I wonder if certain documents perform better at this fidelity than others. To say nothing of missed "prompt expansion" opportunities.
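For anyone unfamiliar with what that splitter call does, here's a rough sketch in plain Python of fixed-size character chunking with overlap. This is not the project's code, and it ignores the separator-aware recursion that LangChain's RecursiveCharacterTextSplitter layers on top; it just shows how chunk_size/chunk_overlap interact: each window starts chunk_size - chunk_overlap characters after the previous one, so with 8000/4000 every character lands in two chunks.

```python
def chunk_text(text, chunk_size=8000, chunk_overlap=4000):
    """Yield fixed-size character windows; each window starts
    (chunk_size - chunk_overlap) characters after the previous one."""
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

# For a 20,000-char document, windows start at 0, 4000, 8000, 12000:
chunks = chunk_text("x" * 20000)
print(len(chunks))  # 4
```

The 50% overlap doubles the index size but reduces the chance that an answer straddles a chunk boundary, which is presumably the trade-off being made here.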
Looks nice!
But some information about the hardware requirements is often missing in this kind of project:<p>- how much RAM is needed<p>- what CPU you need for decent performance<p>- can it run on a GPU? If so, how much VRAM do you need, and does it only work on Nvidia?
Curious about the choice of FAISS. It's a bit older now, and there are many newer options for storing and searching embeddings. Does FAISS still offer some advantages?
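For context on what FAISS is actually doing at this scale: its simplest index type (a flat inner-product index over normalized vectors) is just brute-force cosine similarity. Here's a hypothetical plain-Python sketch of that, not FAISS's actual implementation; FAISS's advantage is doing the same thing with SIMD/GPU kernels and approximate indexes (IVF, HNSW) once brute force gets too slow.

```python
import math

def normalize(v):
    """Scale a vector to unit length so inner product equals cosine similarity."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def search(query, vectors, k=2):
    """Return indices of the k stored vectors with the highest cosine similarity."""
    q = normalize(query)
    scores = [(sum(a * b for a, b in zip(q, normalize(v))), i)
              for i, v in enumerate(vectors)]
    scores.sort(reverse=True)
    return [i for _, i in scores[:k]]

docs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
print(search([1.0, 0.1], docs, k=2))  # [0, 2]
```

For a personal-document corpus of a few thousand chunks, even this naive loop would be fast enough, so the choice of FAISS here is probably about convenience (the LangChain integration) rather than performance.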
Previous submission from 20 days ago: <a href="https://news.ycombinator.com/item?id=39734406">https://news.ycombinator.com/item?id=39734406</a>
Tried giving it a folder with a bunch of .pdfs. It takes so long to index them (and there's no progress bar or status indicator anywhere), and once I ask a question it's just stuck on "Dot is typing" for an hour. Maybe add an option to stream the output, so I can at least tell whether it's doing something?
Not sure whether to install the Windows GPU or CPU app version [1].<p>I have:<p>Processor: Ryzen 5 3600<p>Video card: GeForce GTX 1660 Ti 6GB GDDR6 (Zotac)<p>RAM: 16GB DDR4 2666MHz<p>Any recommendations?<p>[1] <a href="https://dotapp.uk/download.html" rel="nofollow">https://dotapp.uk/download.html</a>