
Ask HN: Are you saving inference costs on GPUs at your company?

5 points, by idomi, 5 months ago
I'm currently trying to solve a problem we're having: GPUs are expensive! I've been thinking of ways to cut our inference costs at my company and wanted to hear your perspective.

Did anyone implement something similar? How did it go? How much time did it save? What was the cost improvement? I recently found this tool in the AWS samples: https://github.com/aws-samples/scalable-hw-agnostic-inference

I'm wondering if anyone has used/tried it, or other approaches?
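One way to frame the "cost improvement" question is a back-of-envelope cost model comparing instance types. This is a minimal sketch; all hourly rates and throughput numbers below are hypothetical placeholders, not real AWS/GCP pricing:

```python
# Hedged sketch: back-of-envelope GPU inference cost comparison.
# Prices and throughputs are made-up placeholders for illustration.

def cost_per_million_requests(hourly_rate_usd: float,
                              requests_per_second: float) -> float:
    """Cost to serve 1M requests at a sustained throughput."""
    seconds_needed = 1_000_000 / requests_per_second
    return hourly_rate_usd * seconds_needed / 3600

# Hypothetical: a large GPU instance vs. a cheaper accelerator.
gpu_cost = cost_per_million_requests(4.00, 200)   # $4.00/hr, 200 req/s
alt_cost = cost_per_million_requests(1.50, 120)   # $1.50/hr, 120 req/s

print(f"GPU: ${gpu_cost:.2f} per 1M requests")
print(f"Alt: ${alt_cost:.2f} per 1M requests")
print(f"Savings: {100 * (1 - alt_cost / gpu_cost):.0f}%")
```

The point of normalizing to cost per million requests is that a cheaper instance with lower throughput can still lose; only the rate/throughput ratio matters.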

1 comment

ricktdotorg, 5 months ago
I've used GCP GPU Cloud Run to build an on-demand, auto-scaling livestream/HLS video translation -> subtitle generation pipeline, with great success.

[edit: sorry, not inference, but a great cost-saver]
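The commenter's scale-to-zero setup can be sketched as a single Cloud Run deployment. The project, image, and service names below are hypothetical, and the GPU flags reflect recent `gcloud` releases (GPU support is region-limited, so check current documentation):

```shell
# Hedged sketch: a scale-to-zero GPU service on Cloud Run.
# Service name, project, and image are placeholders.
gcloud run deploy subtitle-pipeline \
  --image=us-docker.pkg.dev/my-project/pipeline/worker:latest \
  --region=us-central1 \
  --gpu=1 --gpu-type=nvidia-l4 \
  --cpu=4 --memory=16Gi \
  --no-cpu-throttling \
  --min-instances=0 \
  --max-instances=5
```

With `--min-instances=0`, GPU instances are billed only while requests are being handled, which is where the cost saving over an always-on GPU VM comes from.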