科技回声
A tech-news platform built with Next.js, providing global technology news and discussion.

© 2025 科技回声. All rights reserved.

Show HN: Python Bindings for llama.cpp with some CLIs

2 points | by tantony | about 2 years ago
These are my Python bindings for @ggerganov's llama.cpp. They build on that work and provide an easy-to-use interface for Python developers to take advantage of llama.cpp's powerful inference capabilities.

The bindings currently use code from a pending PR of mine that turns the original code into more of a library. Hopefully it will get merged into the main repository soon. I have also added a few CLI entry points that get installed along with the Python package:

* llamacpp-convert - Converts PyTorch models into GGML format. This is an alias for the existing Python script in llama.cpp and requires PyTorch.

* llamacpp-quantize - Performs INT4 quantization on the GGML model. This is a wrapper for the "quantize" C++ program from the original repository and has no dependencies.

* llamacpp-cli - A Python version of the "main.cpp" program from the original repository that uses the bindings.

* llamacpp-chat - A wrapper over llamacpp-cli that includes a prompt which makes it behave like a chatbot. It is not very good as of right now.

You should theoretically be able to run "pip install llamacpp" and get going on most Linux/macOS platforms by just running `llamacpp-cli`. I do not have Windows builds on the CI yet, so you may have to build it yourself.

The package has no dependencies if you just want to run inference on the models.
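Putting the entry points above together, an end-to-end session might look like the following sketch. The model directory and the `-m`/`-p` flags are illustrative assumptions, not documented options of these tools; check each command's `--help` for the actual interface.

```shell
# Install the bindings (prebuilt wheels for Linux/macOS; Windows may need a source build).
pip install llamacpp

# 1. Convert a PyTorch LLaMA checkpoint to GGML format (requires PyTorch).
#    "./models/7B/" is a placeholder path.
llamacpp-convert ./models/7B/

# 2. Quantize the converted GGML model to INT4 (no extra dependencies).
llamacpp-quantize ./models/7B/

# 3. Run inference with the CLI; flag names here are assumptions.
llamacpp-cli -m ./models/7B/ggml-model-q4_0.bin -p "Hello, world"

# Or start the (experimental) chatbot wrapper instead.
llamacpp-chat -m ./models/7B/ggml-model-q4_0.bin
```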

No comments yet.