I think one of the biggest struggles small startups and practitioners face is the lack of a good option between "I wonder if this works" and "ready for prime time." Running locally on consumer hardware is an option, but it's cost-prohibitive for a team. Cloud providers are full of complications and hidden costs. Tools like Friendli and Bento are good but ambiguous on costs, and they get difficult to price end-to-end once you need the full stack of options. Hugging Face inference endpoints and other tools still seem like the best option around, along with cloud DBs like Zilliz.<p>That said, it's no wonder people just pay extra for the simplicity of a slightly smarter endpoint like OpenAI. Sure, over time the costs are insane and you lack the flexibility to create a truly targeted solution, but it <i>feels</i> like an all-in-one easy fix.
Hi everyone, I put together this survey of tools for the LLM Stack in 2024. I've linked the friend-link for the Medium article in the URL. I'd love feedback from you guys about any tools I've missed.<p>If you're a Medium member and want to support my writing, feel free to use the regular link - <a href="https://medium.com/plain-simple-software/the-llm-app-stack-2024-eac28b9dc1e7" rel="nofollow">https://medium.com/plain-simple-software/the-llm-app-stack-2...</a>
This is great! Out of curiosity, what's the difference between choosing a dedicated vector database vs. a traditional database with vector indices (e.g. pgvector with Postgres)?