TechEcho

LLMs: What if we did unlimited tokens instead?

1 point by torrmal about 1 year ago
We partner with the best LLM providers in the world (because we love them all), and we see two gaps: the token-based pricing model, and the lack of an agreed, universal common API (OpenAI's API differs slightly from Google's, and likewise Anthropic's).

So we are running an experiment, a Poly-LLM Service (you can play with it and read more at https://mdb.ai/llm-serve).

What's exciting?

- Unlimited tokens! To make LLM-based applications more viable, we're exploring a world where we developers don't have to worry about price per token. Instead, we want to focus on identifying the LLM that best meets our needs, weighing trade-offs such as throughput, context window size, and domain knowledge (e.g., coding, logic, etc.). For this small release, we are starting with GPT-3.5-Turbo and comparable models, mostly because they strike a good balance between quality, throughput, and context window, making them viable for many production applications.

- A single universal API for the most popular LLMs: the ability to query all models, including Anthropic and Gemini models, through the same OpenAI completions API standard, which helps keep code clean. We handle the translation and brokering for you. In this release we support models like GPT-3.5, Llama2-70b, CodeLlama-70b, Mixtral, gemini-pro, dbrx, etc. (complete list: https://docs.mdb.ai/docs/api/models). The idea is not new, but we felt it was important for there to be a service that is not opinionated between open-source and closed-source models.

Once again, we look forward to hearing your ideas and feedback.
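The practical payoff of a single universal API is that switching providers means changing only the model name, not the request shape. A minimal sketch of what that looks like, assuming the service accepts the standard OpenAI chat-completions payload (the helper function and the exact model identifiers here are illustrative assumptions; see https://docs.mdb.ai/docs/api/models for the real list):

```python
import json

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completions payload.

    Under the brokering model described in the post, only the
    `model` field changes when targeting a different provider;
    the rest of the request stays identical.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# The same payload shape is reused for every brokered model
# (model names here are assumptions for illustration):
for model in ("gpt-3.5-turbo", "gemini-pro", "mixtral"):
    payload = build_chat_request(model, "Hello")
    print(json.dumps(payload))
```

In practice you would POST this payload to the service's completions endpoint (e.g. with any OpenAI-compatible client pointed at the service's base URL); the translation to each provider's native API happens server-side.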

no comments