Ask HN: Are you saving inference costs on GPUs at your company?

5 points by idomi 5 months ago

I'm currently trying to solve a problem we're having: GPUs are expensive! I've been thinking of ways to cut inference costs at my company and wanted to hear your perspective.

Has anyone implemented something along these lines? How did it go? How much time did it save? What was the cost improvement? I recently found this tool in the AWS samples: https://github.com/aws-samples/scalable-hw-agnostic-inference

I'm wondering if anyone has used or tried it, or taken other approaches.

1 comment

ricktdotorg 5 months ago

I've used GCP Cloud Run with GPUs to build an on-demand, auto-scaling livestream/HLS video translation --> subtitle generation pipeline, with great success.

[edit: sorry, not inference, but a great cost-saver]