Looks like LangChain has LangSmith, but it's in closed beta.<p>I saw a couple of YC launches in this space, like Hegel AI.<p>I'm personally interested in deployments in small teams, or teams with a lot of freedom to pick and choose their own tooling.
I'm currently writing up a deployment architecture for LLMs, and the API question is answered here: <a href="https://fine-tuna.com/docs/choosing-a-model/model/" rel="nofollow noreferrer">https://fine-tuna.com/docs/choosing-a-model/model/</a><p>Basically, you can get a Docker container that publishes an OpenAI-API-compatible endpoint, and you can then choose the model that sits behind that API.<p>As deployment will be in Kubernetes, we will use clusters with GPU resources to max out performance, but we're not there yet.
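The nice part of an OpenAI-compatible endpoint is that client code doesn't care which model sits behind it. A minimal sketch of the client side, assuming the container listens on localhost:8000 and exposes a model id of "local-model" (both are placeholders, not taken from the linked docs):

```python
# Minimal sketch: calling a self-hosted, OpenAI-compatible endpoint.
# BASE_URL and MODEL are assumptions -- substitute whatever your container
# actually exposes.
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # assumed container address
MODEL = "local-model"                  # assumed model id behind the API


def build_chat_request(prompt: str) -> dict:
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }


def chat(prompt: str) -> str:
    """POST to the endpoint; same wire format as OpenAI's chat API."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode(),
        headers={
            "Content-Type": "application/json",
            # Local servers typically ignore the key but expect the header.
            "Authorization": "Bearer not-needed",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Swapping the model is then a config change (the container's model setting), not a code change.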
We built an AWS serverless app that handles:<p>- Configurable context and cases mapped to a RESTful API<p>- Multi-account and high-throughput error handling<p>- DDB-backed records of all requests and responses for evaluation, debugging & training<p>- One-click DevOps deploy<p>It has helped us deploy and maintain LLM apps in production quite easily. Let me know if you would like access to the repo.
Play around with LangChain and then convert all of that into decent code. After a few prototypes, you'll realize LangChain and other pipelining frameworks are mostly for non-coders. You can architect elegant solutions yourself.
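To make the "convert it into decent code" point concrete: a typical template-then-parse chain is a few lines of plain Python once you inject the model call as a function. A sketch under that assumption (the template, parser, and `call_model` signature here are illustrative, not from any library):

```python
# A hand-rolled "chain": prompt template -> model call -> output parser.
# call_model is any function str -> str (an API client, a local model, a stub).

def render_prompt(template: str, **vars) -> str:
    """Fill a plain str.format template -- no framework needed."""
    return template.format(**vars)


def parse_bullets(text: str) -> list[str]:
    """Extract '-' bullet lines from the model's reply."""
    return [
        line.lstrip("- ").strip()
        for line in text.splitlines()
        if line.strip().startswith("-")
    ]


def run_chain(topic: str, call_model) -> list[str]:
    """Template -> model -> parse, i.e. the whole 'pipeline'."""
    prompt = render_prompt(
        "List three points about {t}, one per line, each starting with '-'.",
        t=topic,
    )
    return parse_bullets(call_model(prompt))
```

Because the model call is a plain parameter, the chain is trivially testable with a stub and swappable between providers without any abstraction layer.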