科技回声

A tech-news platform built with Next.js, providing global technology news and discussion.

© 2025 科技回声. All rights reserved.

Ask HN: Next steps for scaling a scikit-learn Flask ML API

1 point | by frist45 | about 7 years ago
We currently have an internal API that's core to our business. The models are loaded as .pkl files with scikit-learn's joblib and served via Flask with Gunicorn using the gevent worker class. We've tried Tornado as a worker class and CherryPy as a replacement for Gunicorn; neither produced significant performance benefits.

We're hosting it in a Kubernetes cluster with very large nodes (140GB). Each container uses ~5GB of RAM, and given the response time (~750ms), each node we add ($1.5k) only gains us about 30 req/sec. A single request appears to be CPU-bound, which makes it difficult to scale widely.

This is cost-prohibitive, and it feels like we need to move toward other tools and approaches.

As the person managing the infrastructure, I'm less familiar with the current ecosystem of larger-scale tooling. Ideally, the next iteration would keep the HTTP transport layer to allow for minimal changes to the rest of the system.

What would be a logical next step for us to scale the existing scikit-learn/Flask API?
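For reference, the architecture the poster describes (a joblib-persisted scikit-learn model loaded once per worker and served from a Flask endpoint) can be sketched roughly as follows. This is a minimal illustration, not the poster's actual code: the toy model, file name, and route are assumptions.

```python
# Minimal sketch of the setup described in the post: a scikit-learn
# model persisted with joblib and served over HTTP with Flask.
# The tiny toy model and the /predict route are illustrative only.
import joblib
import numpy as np
from flask import Flask, jsonify, request
from sklearn.linear_model import LogisticRegression

# Stand-in for the real .pkl artifact: train and persist a tiny model.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])
joblib.dump(LogisticRegression().fit(X, y), "model.pkl")

app = Flask(__name__)
model = joblib.load("model.pkl")  # loaded once per worker process

@app.route("/predict", methods=["POST"])
def predict():
    # Expect JSON like {"features": [[0.0], [3.0]]}
    features = np.array(request.get_json()["features"])
    return jsonify(prediction=model.predict(features).tolist())
```

Under the setup described, something like `gunicorn -k gevent app:app` would front this; each Gunicorn worker loads its own copy of the model, which is consistent with the per-container memory footprint the poster reports.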

No comments yet.
