This is the point of it:

https://github.com/ggerganov/llama.cpp/pull/11016#issuecomment-2599740463
This looks great!

While we're at it, is there already some kind of standardized local storage location/scheme for LLM models? If not, this project could be a great place to set an example that others can follow if they want. I've been playing with different runtimes (Ollama, vLLM) over the last few days, and I really would have appreciated better interoperability in terms of shared model storage, instead of everybody defaulting to downloading everything all over again.
To make AI really boring, all these projects need to be more approachable to non-tech-savvy people, e.g. some minimal GUI for searching, listing, deleting, and installing AI models. I wish this or Ollama could work more as an invisible dependency manager for AI models. Right now every app that wants STT like Whisper bundles such a model inside. Users waste storage and have to wait to download big models. We had similar problems with static libraries and then moved to dynamically linked libraries.

I wish your app could add a model as a dependency and, on install, download it only if that model isn't already available locally. It could also check whether Ollama is installed and only bootstrap if nothing exists on the drive yet, ideally with a nice interface for the user to confirm the download and some nice onboarding.
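Roughly what I have in mind, as a hypothetical sketch — the cache paths and the download step are placeholders for illustration, not real RamaLama or Ollama APIs:

    from pathlib import Path

    # Hypothetical cache locations an installer might probe before downloading;
    # these are common defaults (Ollama, Hugging Face) but purely illustrative.
    MODEL_DIRS = [
        Path.home() / ".ollama" / "models",
        Path.home() / ".cache" / "huggingface" / "hub",
    ]

    def ensure_model(name: str) -> Path:
        """Return a local copy of `name`, downloading only if it is missing."""
        for cache in MODEL_DIRS:
            candidate = cache / name
            if candidate.exists():
                return candidate  # reuse whatever is already on disk
        # Placeholder: ask the user to confirm, then fetch the model here.
        raise FileNotFoundError(f"{name} not found locally; download needed")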
One of my primary goals for RamaLama was to allow users to move AI models into containers, so they can be stored in OCI registries. I believe there is going to be a proliferation of "private" models, and eventually "private" RAG data. (I'm working heavily on RAG support in RamaLama now.)

Once you have private models and RAG, I believe you will want to run these models and data on edge devices and in Kubernetes clusters. Getting the AI models and data into OCI content would let us take advantage of content signing, trust, and mirroring, and make running AI in production easier.

It would also allow users to block access to outside "untrusted" AI models stored on the internet, and allow companies to use only "trusted" AI.

Since companies already have OCI registries, it makes sense to store your AI models and content in the same location.
122 points 2 hours ago, yet this is currently #38 and not on the front page. Strange. At the same time I see numerous items on the front page posted 2 hours ago or older with fewer points.

I'm willing to take a reputation hit on this meta post. I wonder why this got demoted from the front page so quickly despite people clearly voting on it. I wonder if it has anything to do with being backed by YC.

I sincerely hope it's just my misunderstanding of the HN algorithm, though.
> Running in containers eliminates the need for users to configure the host system for AI.<p>When is that a problem?<p>Based on the linked issue in eigenvalue's comment[1], this seems like a very good thing. It sounds like ollama is up to no good and this is a good drop-in replacement. What is the deeper problem being solved here though, about configuring the host? I've not run into any such issue.<p>1. <a href="https://news.ycombinator.com/item?id=42888129">https://news.ycombinator.com/item?id=42888129</a>
What benefit does Ollama (or RamaLama) offer over plain llama.cpp or llamafile? The only thing I understand is that there is automatic downloading of models behind the scenes, but a big reason for me to use local models at all is that I want to know exactly what files I use and keep them sorted and backed up properly, so a tool automatically downloading models and dumping them in some cache directory just sounds annoying.
Does this provide an Ollama-compatible API endpoint? I've got at least one other project running that only supports Ollama's API or OpenAI's hosted solution (i.e. the API endpoint isn't configurable to use llama.cpp and friends).
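For reference, here's a minimal sketch of the kind of call involved, using the OpenAI-style /v1/chat/completions route that llama.cpp's llama-server exposes — the host, port, and model name are placeholder assumptions, and whether RamaLama also answers Ollama's native /api/* routes is exactly the open question:

    import json
    import urllib.request

    # Assumption: an OpenAI-compatible server (e.g. llama.cpp's llama-server,
    # or `ramalama serve`) is listening locally; adjust host/port/model to taste.
    payload = {
        "model": "tinyllama",  # placeholder model name
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    }
    req = urllib.request.Request(
        "http://localhost:8080/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.load(resp)
    print(reply["choices"][0]["message"]["content"])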
So it's a replacement for Ollama?

The killer features of Ollama for me right now are the nice library of quantized models and the ability to automatically start and stop serving models in response to incoming requests and timeouts. The first seems to be solved by reusing the Ollama models, but I can't tell from a cursory look whether the second is possible.
I am doing a short talk on this tomorrow at FOSDEM:

https://fosdem.org/2025/schedule/event/fosdem-2025-4486-ramalama-making-working-with-ai-models-boring/