Hey HN,<p>We've been building OpenRouter, a multi-model API for LLMs, since we launched the Window AI extension this April [0].<p>The events of the past week have made it clear that the LLM landscape is unpredictable, so LLM aggregators are picking up interest. Unlike others, we're building our router on top of a public, explorable dataset.<p>We recently built a way to rank and visualize this data, including the token counts we see flowing to and from different models, both open-source (like Llama, Mistral, finetunes, and variants) and closed-source (OpenAI, Anthropic).<p>The API supports
- 50+ different models: [1]
- Consolidated payments for all models
- OAuth, so users can pay for usage directly
- Upstream latency/throughput tracking
- Multiple providers per model, for redundancy (downtime happens!)
- Prompt compression, so you don't have to worry as much about context length<p>Docs: [2]<p>We support 11 model hosts, including our own, built on vLLM, which we've just open-sourced: [3]<p>Some users have opted in to sharing their prompts, which will soon let us show which models are best for different tasks.<p>Let us know your feedback, and if you've worked on a similar problem before!<p>Alex and Louis<p>[0] <a href="https://news.ycombinator.com/item?id=35481760">https://news.ycombinator.com/item?id=35481760</a>
[1] <a href="https://openrouter.ai/models" rel="nofollow noreferrer">https://openrouter.ai/models</a>
[2] <a href="https://openrouter.ai/docs" rel="nofollow noreferrer">https://openrouter.ai/docs</a>
[3] <a href="https://github.com/OpenRouterTeam/openrouter-runner">https://github.com/OpenRouterTeam/openrouter-runner</a>
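<p>If you're curious what using it looks like, here's a minimal sketch of a chat-completions call against the OpenAI-compatible endpoint (the model slug is just an example; check the models page for current IDs):<p><pre><code>
```python
# Minimal sketch of an OpenRouter chat-completions call, using only the
# standard library. The model slug below is an example, not a guarantee.
import json
import os
import urllib.request


def build_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build a POST request for OpenRouter's OpenAI-compatible chat API."""
    payload = {
        "model": model,  # e.g. "mistralai/mistral-7b-instruct"
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        "https://openrouter.ai/api/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    key = os.environ.get("OPENROUTER_API_KEY")
    if key:  # only hit the network when a key is configured
        req = build_request("mistralai/mistral-7b-instruct", "Hello!", key)
        with urllib.request.urlopen(req) as resp:
            body = json.loads(resp.read())
            print(body["choices"][0]["message"]["content"])
```
</code></pre><p>Because the request/response shape matches OpenAI's, existing OpenAI client libraries work too if you point them at the openrouter.ai base URL and swap in your OpenRouter key.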