Show HN: Collider – the platform for local LLM debug and inference at warp speed

3 points by Ambix over 1 year ago
ChatGPT turns one today :)

What a day to launch the project I've been tinkering with for more than half a year. Welcome a new LLM platform suited both to individual research and to scaling AI services in production.

GitHub: https://github.com/gotzmann/collider

Some superpowers:

- Built with performance and scaling in mind, thanks to Golang and C++
- No more problems with Python dependencies and broken compatibility
- Most modern CPUs are supported: any Intel/AMD x64 platforms, server and Mac ARM64
- GPUs supported as well: Nvidia CUDA, Apple Metal, OpenCL cards
- Split really big models between multiple GPUs (warp LLaMA 70B with 2x RTX 3090)
- Not bad performance on modest CPU machines, fast as hell inference on monsters with beefy GPUs
- Both regular FP16/FP32 models and their quantised versions are supported - 4-bit really rocks!
- Popular LLM architectures already there: LLaMA, Starcoder, Baichuan, Mistral, etc.
- Special bonus: proprietary Janus Sampling for code generation and non-English languages

1 comment

smcleod over 1 year ago
Tip: Winter and “Fall” don’t really mean much when you’re talking about something global like software or media milestones - I’d suggest stating either Q1/Q2 etc. or early/late.