Could you detail what you mean by deploying LLMs? Is it about integrating commercial LLMs in an enterprise context? Running a self-hosted LLM for a small company (e.g. Ollama + Ollama Web UI)? Or integrating an agentic approach into an existing software stack?
Not enough info.

Do they want near-realtime responses? Will they all hit it at the same time? Can you put some workloads in an overnight batch queue?