
Show HN: Banana.dev Serverless GPUs for Machine Learning Inference

16 points by votick almost 3 years ago

2 comments

votick almost 3 years ago
author here:

hey HN, we used to run an ML consultancy for a year that helped companies build & host models in prod. We learned how tedious & expensive it was to host ML. Customer models had to run on a fleet of always-on GPUs that would often get <10% utilization, which felt like a big money sink.

Over time we built infrastructure to improve GPU utilization. Six months ago we pivoted to focus solely on productizing this infra into a hosting platform for ML teams, one that removes the pain of deployment and reduces the cost of hosting models.

We deploy on A100 GPUs, and you pay per second of inference. If you aren't running inferences, you pay nothing. A couple of points to clarify: Yes, the models are actually cold-booted; we aren't just running them in the background. We boot models faster due to how we manage OS memory. Yes, there is still cold-boot time; it's not instant, but it's significantly faster (e.g., 15 seconds instead of 10 minutes for some transformers like GPT-J).

Lastly, model quality is not lost on Banana, because we aren't doing traditional weight quantization or network pruning, which make networks smaller/faster but sacrifice quality. You can think of Banana more as a compiler + hosting platform: we break down your code to run faster on GPUs.

Try it out and let us know what you think!
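For anyone curious what this pay-per-second model looks like from the client side, here is a minimal sketch. The endpoint URL, auth header, and payload shape are illustrative assumptions, not Banana's actual API; the point is that the first request absorbs the cold boot the post describes, while warm requests return quickly.

import time
import requests

# Hypothetical serverless inference endpoint (illustrative only;
# not Banana's real API surface).
ENDPOINT = "https://api.example-serverless-gpu.dev/v1/infer"
API_KEY = "your-api-key"

def run_inference(prompt: str) -> dict:
    """POST a prompt to the hypothetical endpoint and return the JSON result."""
    resp = requests.post(
        ENDPOINT,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"model": "gptj", "inputs": {"prompt": prompt}},
        timeout=60,  # generous timeout so a cold boot doesn't trip the client
    )
    resp.raise_for_status()
    return resp.json()

# First call pays the cold-boot cost (seconds, per the post, rather than
# the minutes a from-scratch deployment would take).
start = time.time()
run_inference("Hello from a cold start")
print(f"cold call: {time.time() - start:.1f}s")

# Subsequent calls hit a warm worker and return much faster, and you are
# billed only for inference time, not for idle GPU hours.
start = time.time()
run_inference("Hello again, now warm")
print(f"warm call: {time.time() - start:.1f}s")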
merwanedr almost 3 years ago
This is really cool, but I can't wait for a classic HN comment like:

- HN midwit: "Who names a company after a fruit?"
- Erik and Kyle: "Well..."
Comment #32183104 not loaded.