
Ollama releases Python and JavaScript Libraries

607 points by adamhowell, over 1 year ago

31 comments

rgbrgb · over 1 year ago
Are these libraries for connecting to an ollama service that the user has already installed or do they work without the user installing anything? Sorry for not checking the code but maybe someone has the same question here.

I looked at using ollama when I started making FreeChat [0] but couldn't figure out a way to make it work without asking the user to install it first (think I asked in your discord at the time). I wanted FreeChat to be 1-click install from the mac app store so I ended up bundling the llama.cpp server instead, which it runs on localhost for inference. At some point I'd love to swap it out for ollama and take advantage of all the cool model pulling stuff you guys have done, I just need it to be embeddable.

My ideal setup would be importing an ollama package in swift which would start the server if the user doesn't already have it running. I know this is just js and python to start but a dev can dream :)

Either way, congrats on the release!

[0]: https://github.com/psugihara/FreeChat
ivanfioravanti · over 1 year ago
I posted about the Python library a few hours after release. Great experience. Easy, fast, and works well.

I created a gist with a quick-and-dirty way of generating a dataset for fine-tuning a Mistral model using the Instruction Format on a given topic: https://gist.github.com/ivanfioravanti/bcacc48ef68b02e9b7a4034161824287
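A minimal sketch of that kind of dataset generation with the new ollama Python package is below; the topic, model name, output path, and prompt wording are placeholder assumptions for illustration, not what the linked gist actually does.

    # Sketch: generate instruction-format examples on a topic with the ollama
    # Python library and write them to a JSONL file for later fine-tuning.
    # Assumes a local Ollama server with the "mistral" model already pulled.
    import json
    import ollama

    TOPIC = "home espresso brewing"   # placeholder topic
    OUT_PATH = "dataset.jsonl"        # placeholder output file

    prompt = (
        f"Write one question a user might ask about {TOPIC}, then answer it. "
        "Respond as JSON with the keys 'instruction' and 'output'."
    )

    with open(OUT_PATH, "w") as f:
        for _ in range(10):  # number of examples is arbitrary here
            resp = ollama.chat(
                model="mistral",
                messages=[{"role": "user", "content": prompt}],
                format="json",  # ask the server for JSON-formatted output
            )
            example = json.loads(resp["message"]["content"])
            f.write(json.dumps(example) + "\n")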
filleokus · over 1 year ago
An off-topic question: is there such a thing as a "small-ish language model"? A model that you could simply give instructions / "capabilities" to, which a user can then interact with. Almost Siri-level intelligence.

Imagine you have an API endpoint where you can set the level of some lights, and you give the chat a system prompt explaining how to build the JSON body of the request, and the user can prompt it with stuff like "Turn off all the lights" or "Make it bright in the bedroom" etc.

How low could the memory consumption of such a model be? We don't need to store who the first kaiser of Germany was, "just" enough to kinda map human speech onto available APIs.
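The JSON-mapping half of that idea can already be prototyped with the Python library and a small quantized model; the sketch below only illustrates the pattern, and the model name, the schema in the system prompt, and how reliably a small model sticks to it are all assumptions rather than anything the library guarantees.

    # Sketch: map free-form text onto a lights API request via a system prompt.
    # Assumes a local Ollama server; "phi" stands in for any small model.
    import json
    import ollama

    SYSTEM = (
        "You control lights over an HTTP API. Reply ONLY with JSON of the form "
        '{"room": "<room name or \'all\'>", "brightness": <0-100>}.'
    )

    def command_to_request(user_text: str) -> dict:
        resp = ollama.chat(
            model="phi",  # placeholder small model
            messages=[
                {"role": "system", "content": SYSTEM},
                {"role": "user", "content": user_text},
            ],
            format="json",  # nudge the server toward valid JSON output
        )
        return json.loads(resp["message"]["content"])

    print(command_to_request("Make it bright in the bedroom"))
    # e.g. {"room": "bedroom", "brightness": 100} -- actual output may vary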
reacharavindh · over 1 year ago
Not directly related to what Ollama aims to achieve, but I'll ask nevertheless.

Local LLMs are great! But they would be more useful once we can _easily_ throw our own data at them to use as reference or even as a source of truth. This is where it opens doors that a closed system like OpenAI cannot - I'm never going to upload some data to ChatGPT for them to train on.

Could Ollama make it easier and standardize the way to add documents to local LLMs?

I'm not talking about uploading one image or model and asking a question about it. I'm referring to pointing at a repository of 1000 text files and asking LLMs questions based on their contents.
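Ollama itself doesn't ship a document store, but a rough do-it-yourself version of this can be stitched together from the embeddings and chat calls in the Python library. The sketch below is assumption-heavy (placeholder model names, whole files as chunks, brute-force cosine search over a "notes" folder) and is meant only to show the shape of the approach.

    # Sketch: answer questions over a folder of .txt files using Ollama
    # embeddings for retrieval and a chat model for the final answer.
    import math
    import pathlib
    import ollama

    EMBED_MODEL = "nomic-embed-text"  # placeholder embedding model
    CHAT_MODEL = "llama2"             # placeholder chat model

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb)

    # Embed every file once (real use would chunk long files and cache vectors).
    docs = []
    for path in pathlib.Path("notes").glob("*.txt"):
        text = path.read_text()
        vec = ollama.embeddings(model=EMBED_MODEL, prompt=text)["embedding"]
        docs.append((path.name, text, vec))

    question = "What did we decide about the Q3 roadmap?"
    qvec = ollama.embeddings(model=EMBED_MODEL, prompt=question)["embedding"]

    # Take the three most similar files and stuff them into the prompt.
    top = sorted(docs, key=lambda d: cosine(qvec, d[2]), reverse=True)[:3]
    context = "\n\n".join(text for _, text, _ in top)

    answer = ollama.chat(
        model=CHAT_MODEL,
        messages=[{
            "role": "user",
            "content": f"Answer using this context:\n{context}\n\nQuestion: {question}",
        }],
    )
    print(answer["message"]["content"])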
porridgeraisin · over 1 year ago
Used ollama as part of a bash pipeline for a tiny throwaway app.

It blocks until there is something on the mic, then sends the wav to whisper.cpp, which then sends it to llama, which picks out a structured "remind me" object from it, which gets saved to a text file.
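The last hop of that pipeline maps fairly directly onto the new Python library. The sketch below assumes the transcript has already been produced by whisper.cpp and only shows the "pick out a structured remind-me object" step; the model name and the JSON fields are placeholders, not what the commenter's app used.

    # Sketch: turn a whisper.cpp transcript into a structured "remind me" entry.
    import json
    import ollama

    transcript = "remind me to water the plants tomorrow at 8am"  # from whisper.cpp

    resp = ollama.chat(
        model="llama2",  # placeholder model
        messages=[
            {
                "role": "system",
                "content": 'Extract a reminder as JSON with keys "task" and "when". '
                           "Reply with JSON only.",
            },
            {"role": "user", "content": transcript},
        ],
        format="json",
    )

    reminder = json.loads(resp["message"]["content"])
    with open("reminders.txt", "a") as f:
        f.write(json.dumps(reminder) + "\n")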
deepsquirrelnet · over 1 year ago
I love the ollama project. Having a local LLM running as a service makes sense to me. It works really well for my use.

I'll give this Python library a try. I've been wanting to try some fine-tuning with LLMs-in-the-loop experiments.
palashkulsh · over 1 year ago
Noob question, probably being asked in the wrong place: is there any way to find out the minimum system requirements for running `ollama run` with different models?
sqs · over 1 year ago
I posted about my awesome experiences using Ollama a few months ago: https://news.ycombinator.com/item?id=37662915. Ollama is definitely the easiest way to run LLMs locally, and that means it's the best building block for applications that need to use inference. It's like how Docker made it so any application can execute something kinda portably, kinda safely, on any machine. With Ollama, any application can run LLM inference on any machine.

Since that post, we shipped experimental support in our product for Ollama-based local inference. We had to write our own client in TypeScript but will probably be able to switch to this instead.
joaomdmoura · over 1 year ago
So cool! I have been using Ollama for weeks now and I just love it! It's the easiest way to run local LLMs; we are actually embedding them into our product right now and are super excited about it!
Kostic · over 1 year ago
I used this half a year ago, love the UX, but it was not possible to accelerate the workloads using an AMD GPU. How's the support for AMD GPUs under Ollama today?
jquaint · over 1 year ago
I'm a huge fan of Ollama. Really like how easy it makes local LLM + neovim: https://github.com/David-Kunz/gen.nvim
imrehg · over 1 year ago
This should make it easier to integrate with things like Vanna.ai, which was on HN recently.

There are a bunch of methods that need to be implemented for it to work, but then the usual OpenAI bits can be switched out for anything else, e.g. see the code stub in https://vanna.ai/docs/bigquery-other-llm-vannadb.html

Looking forward to more remixes for other tools too.
hatmanstack · over 1 year ago
Why does this feel like an exercise in the high-priesting of coding? Shouldn't a Python library have everything necessary and work out of the box?
behnamoh · over 1 year ago
What I hate about ollama is that it makes server configuration a PITA. ollama relies on llama.cpp to run GGUF models, but while llama.cpp can keep the model in memory using `mlock` (helpful to reduce inference times), ollama simply won't let you do that:

https://github.com/ollama/ollama/issues/1536

Not to mention, they hide all the server configs in favor of their own "sane defaults".
mfalcon · over 1 year ago
I love Ollama's simplicity for downloading and consuming different models with its REST API. I've never used it in a "production" environment though - does anyone know how Ollama performs, or is it better to move to something like vLLM for that?
techn00 · over 1 year ago
Does Ollama support GBNF grammars?
pamelafox · over 1 year ago
API wise, it looks very similar to the OpenAI python SDK but not quite the same. I was hoping I could swap out one client for another. Can anyone confirm they’re intentionally using an incompatible interface?
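For anyone wondering what the difference looks like in practice, here is a rough side-by-side of the two call shapes as I understand them; it's an illustration drawn from the two SDKs' documented usage, not a statement about intended compatibility, and details may change.

    # Sketch: the same chat request via the ollama library and the OpenAI SDK.
    # The shapes are similar, but the clients and response objects differ.
    import ollama
    from openai import OpenAI

    messages = [{"role": "user", "content": "Why is the sky blue?"}]

    # Ollama: module-level function, dict-style response.
    r1 = ollama.chat(model="llama2", messages=messages)
    print(r1["message"]["content"])

    # OpenAI: client object, response with a .choices list.
    client = OpenAI()
    r2 = client.chat.completions.create(model="gpt-3.5-turbo", messages=messages)
    print(r2.choices[0].message.content)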
WhackyIdeas · over 1 year ago
This is going to make my current project a million times easier. Nice.
malux85 · over 1 year ago
I love ollama. The engine underneath is llama.cpp, and they have the first version of self-extend about to be merged into main, so with any luck it will be available in ollama soon too!
dchuk · over 1 year ago
Is anyone using this as an API behind a multi-user web application? Or does it need to be fed off of a message queue or something to basically keep it single-threaded?
Havoc · over 1 year ago
What model format does ollama use? Or is one constrained to the handful of preselected models they list?
awongh · over 1 year ago
Wow, I guess I wouldn’t have thought there would be GPU support. What’s the mechanism for this?
cranberryturkey · over 1 year ago
`ollama serve` exposes an API you can query with fetch. Why the need for a library?
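For reference, hitting that API directly is only a few lines (shown in Python here to match the other examples; the model name and prompt are placeholders). As far as I can tell the new libraries are mostly convenience wrappers, plus streaming helpers, over endpoints like this one.

    # Sketch: calling the local Ollama REST API directly, no client library.
    import requests

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama2", "prompt": "Why is the sky blue?", "stream": False},
    )
    print(resp.json()["response"])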
lobocinza · over 1 year ago
ollama feels like llama.cpp with extra undesired complexities. It feels like the former project is desperately trying to differentiate and monetize, while the latter is where all the things that matter happen.
sjwhevvvvvsj · over 1 year ago
Literally wrote an Ollama wrapper class last week. Doh!
bearjaws · over 1 year ago
If you're using TypeScript I highly recommend modelfusion: https://modelfusion.dev/guide/

It is far more robust, integrates with any LLM local or hosted, supports multi-modal, retries, structured parsing using zod, and more.
nextlevelwizard · over 1 year ago
What is the benefit?

Ollama already exposes a REST API that you can query with whatever language (or, you know, just using curl) - why do I want to use Python or JS?
leansensei · over 1 year ago
There is also an Elixir library: https://overbring.com/blog/2024-01-14-ollamex-ollama-api-embeddings/
3Sophons · over 1 year ago
The Rust+Wasm stack provides a strong alternative to Python in AI inference.

* Lightweight. Total runtime size is 30MB as opposed to 4GB for Python and 350MB for Ollama.
* Fast. Full native speed on GPUs.
* Portable. Single cross-platform binary on different CPUs, GPUs and OSes.
* Secure. Sandboxed and isolated execution on untrusted devices.
* Modern languages for inference apps.
* Container-ready. Supported in Docker, containerd, Podman, and Kubernetes.
* OpenAI compatible. Seamlessly integrate into the OpenAI tooling ecosystem.

Give it a try: https://www.secondstate.io/articles/wasm-runtime-agi/
jdlyga · over 1 year ago
Thanks Ollama
rezonant · over 1 year ago
I wish JS libraries would stop using default exports. They are not ergonomic as soon as you want to export one more thing in your package, which includes types, so all but the most trivial package requires multiple exports.

Just use a sensibly named export; you were going to write a "how to use" code snippet for the top of your readme anyway.

Also means that all of the code snippets your users send you will be immediately sensible, even without them having to include their import statements (assuming they don't use "as" renaming, which only makes sense when there's conflicts anyway).