Hi HN!
These days I've been getting into Python model deployment and haven't really found a great solution for my specific problem yet.
The use case is that there are thousands of models trained using scikit-learn (efficient prediction, low footprint) that I want to deploy.

Most solutions seem to expose an API per model, but I think in my case this is unnecessary:
in principle, each model will be called every hour (or at some other predefined frequency), and its prediction stored in a database. From there the user/dashboard will access the predictions as well as the time series they come from.

I could deploy a Docker container per model, but given that 99.9% of the time there is no work to do but wait for new values, I'm thinking there could be a smarter way: something along the lines of a queue from which the models are executed at every timestep (rough sketch below).

I do want to avoid a single point of failure, need to be able to version and manage the models after deployment, and need to be able to deploy this in the Azure ecosystem. Does anyone have experience doing anything like this?
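To make the queue idea concrete, here's roughly what I have in mind. It's only a minimal sketch: the queue name, the model path layout, and the two database helpers are placeholders for my setup, and I haven't settled on Azure Storage Queues vs. Service Bus or anything else.

    import datetime
    import joblib
    from azure.storage.queue import QueueClient

    CONN_STR = "<azure-storage-connection-string>"  # placeholder
    queue = QueueClient.from_connection_string(CONN_STR, queue_name="model-jobs")

    def enqueue_all_models(model_ids):
        # Runs once per timestep (e.g. from a timer-triggered Function):
        # one message per model, so any idle worker can pick it up.
        for model_id in model_ids:
            queue.send_message(model_id)

    def fetch_latest_features(model_id):
        # Placeholder: query the time-series database for the newest
        # observations belonging to this model.
        ...

    def store_prediction(model_id, timestamp, prediction):
        # Placeholder: write the prediction back to the database the
        # dashboard reads from.
        ...

    def worker_loop():
        # A small pool of these workers replaces one container per model.
        for msg in queue.receive_messages():
            model_id = msg.content
            # Load the (small) scikit-learn model; could also pull from Blob Storage.
            model = joblib.load(f"/models/{model_id}.joblib")
            features = fetch_latest_features(model_id)
            prediction = model.predict(features)
            store_prediction(model_id, datetime.datetime.utcnow(), prediction)
            # Only delete the message once the prediction is safely stored.
            queue.delete_message(msg)

The appeal is that a handful of generic workers can score thousands of models, and if one worker dies the messages just get picked up by another, but I'd like to hear if there's a more standard way to do this.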
I'm looking into things like seldon.io and Databricks, which look like they could provide what I need, but I'm wondering whether they're overkill for a setup like this?