TechEcho
Dot – A standalone open source app meant for easy use of local LLMs and RAG

185 points by irsagent about 1 year ago

11 comments

reacharavindh about 1 year ago
I'm curious to try it out. There seem to be many options to upload a document and ask stuff about it.

But the holy grail is an LLM that can successfully work on a large corpus of documents and data, like Slack history or huge wiki installations, and answer useful questions with proper references.

I tried a few, but they don't really hit the mark. We need the usability of a simple search engine UI with private data sources.
NKosmatos about 1 year ago
Looks promising, especially if you can select just your docs and avoid interacting with Mistral. I'll give it a try to see how it performs. So far I've had mixed results with other similar solutions.

https://news.ycombinator.com/item?id=39925316

https://news.ycombinator.com/item?id=39896923
logro about 1 year ago
I have a reasonably vast library of technical/scientific epubs/documents. Could I use this to import them and then quiz the books?
MasterYoda about 1 year ago
I have collected so much information in text files on my computer that it has become unmanageable to find anything. Now, with local AI solutions, I wonder if I could create a smart search engine that answers questions from my personal data.

My questions:

1 - Even if there is so much data that I can no longer find stuff, how much text data is needed to train an LLM to work OK? I'm not after an AI that can answer general questions, only one that can answer from what I already know exists in the data.

2 - I understand that the more structured the data is, the better, but how important is structure when training an LLM? Does it mostly figure things out well anyway?

3 - Any recommendations on where to start: how to run an LLM locally and train it on your own data?
gavmor about 1 year ago
Thanks for sharing! I look forward to playing with this once I get off my phone. Took a look at the code, though, to see if you've implemented any of the tricks I've been too lazy to try.

`text_splitter=RecursiveCharacterTextSplitter( chunk_size=8000, chunk_overlap=4000)`

Does this simple numeric chunking approach actually work? Or are more sophisticated splitting rules going to make a difference?

`vector_store_ppt=FAISS.from_documents(text_chunks_ppt, embeddings)`

So we're embedding all 8000 chars behind a single vector index. I wonder if certain documents perform better at this fidelity than others. To say nothing of missed "prompt expansion" opportunities.
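
To make the trade-off in those splitter settings concrete, here is a minimal sketch of fixed-size chunking with overlap. It is plain character windowing written for illustration, not LangChain's `RecursiveCharacterTextSplitter` (which also tries to break on separators such as paragraph and line boundaries); `chunk_text` and the sample string are hypothetical names. With an overlap of half the chunk size, as in the quoted `8000`/`4000` setting, most characters end up in two chunks, so roughly twice the text gets embedded.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Illustrative only: no separator awareness, just sliding windows.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Scaled-down stand-in for chunk_size=8000, chunk_overlap=4000.
sample = "abcdefghijklmnopqrst"  # 20 characters
chunks = chunk_text(sample, chunk_size=8, chunk_overlap=4)
for c in chunks:
    print(c)
```

Each window shares its second half with the start of the next one, which keeps sentences that straddle a boundary retrievable at the cost of a larger index.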
eole666 about 1 year ago
Looks nice! But information about hardware requirements is often missing in this kind of project:

- how much RAM is needed

- what CPU do you need for decent performance

- can it run on a GPU? And if it does, how much VRAM do you need / does it work only on Nvidia?
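
For a rough first answer to the RAM/VRAM question, the memory taken by model weights can be estimated from the parameter count and the bits stored per weight. This is a back-of-the-envelope sketch, not a statement of Dot's actual requirements: the 7-billion parameter count and the bit widths are assumptions chosen for illustration, and the estimate ignores the KV cache, activations, and the document index.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in GiB."""
    bytes_total = n_params * bits_per_weight / 8
    return bytes_total / 2**30


# Assumed example: a 7-billion-parameter model (illustrative, not Dot's config).
fp16 = weight_memory_gib(7e9, 16)  # 16-bit weights
q4 = weight_memory_gib(7e9, 4)     # 4-bit quantized weights

print(f"16-bit: {fp16:.1f} GiB, 4-bit: {q4:.1f} GiB")
```

Real usage is higher than these figures, so treat them as a lower bound when deciding whether a model fits in a given amount of RAM or VRAM.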
turnsout about 1 year ago
Curious about the choice of FAISS. It's a bit older now, and there are many more options for creating and selecting embeddings. Does FAISS still offer some advantages?
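
For context on what any vector store is doing here, the core operation is "find the stored vectors closest to the query vector". A minimal sketch of the exhaustive version follows, using cosine similarity in plain Python. This is not the FAISS API; `cosine`, `top_k`, and the toy vectors are hypothetical and only show the baseline that libraries such as FAISS are built to speed up.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query: list[float], vectors: list[list[float]], k: int) -> list[int]:
    """Return indices of the k stored vectors most similar to the query."""
    scored = sorted(
        ((cosine(query, v), i) for i, v in enumerate(vectors)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return [i for _, i in scored[:k]]


# Toy 2-D "embeddings"; real ones have hundreds of dimensions.
toy_vectors = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
nearest = top_k([0.9, 0.1], toy_vectors, k=2)
print(nearest)
```

This scan costs one comparison per stored vector per query, which is fine for a personal document folder and becomes the bottleneck only at much larger scales.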
PhilippGille about 1 year ago
Previous submission from 20 days ago: https://news.ycombinator.com/item?id=39734406
mdrzn about 1 year ago
Tried giving it a folder with a bunch of .pdfs. It takes soooo long to index them (and there's no progress bar or status indicator anywhere), and once I ask a question it's just stuck on "Dot is typing" for an hour. Maybe add an option to stream the output, so at least I can tell whether it's doing something or not?
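
The streaming idea in this comment can be sketched as a generator that hands tokens to the UI as soon as they exist, instead of returning one finished string. This is an illustration only and not Dot's code: `fake_generate` is a hypothetical stand-in for a local model, and `on_token` stands in for whatever callback updates the chat window.

```python
from typing import Callable, Iterator


def fake_generate(prompt: str) -> Iterator[str]:
    """Hypothetical stand-in for a model: yields one token at a time."""
    for word in f"You asked: {prompt}".split():
        yield word + " "


def stream_answer(prompt: str, on_token: Callable[[str], None]) -> str:
    """Forward each token to the UI callback as soon as it is produced."""
    parts = []
    for token in fake_generate(prompt):
        on_token(token)       # the user sees progress immediately
        parts.append(token)
    return "".join(parts).strip()


received: list[str] = []
answer = stream_answer("is it working?", received.append)
print(answer)
```

Even when generation is slow, the first token arriving tells the user the app is alive, which is the feedback missing from a static "Dot is typing" indicator.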
pentagrama about 1 year ago
Not sure whether to install the Windows GPU or CPU app version [1].

I have:

Processor: Ryzen 5 3600

Video card: GeForce GTX 1660 Ti 6 GB GDDR6 (Zotac)

RAM: 16 GB DDR4 2666 MHz

Any recommendations?

[1] https://dotapp.uk/download.html
bee_rider about 1 year ago
Imagine the marketing coup, when we’re all saying “Machine learning? Eh, it’s all just a bunch of Dot’s products.”