I'm failing to see why k8s needs to be involved here. It's overkill for most model serving cases, and requiring it here adds extra overhead. So it's not really "any cloud", it's any cloud where you're already running EKS/AKS/etc.
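To be concrete, here's roughly what serving looks like with no k8s at all: one Python process you can run on any VM or drop in a plain container. A minimal sketch; the FastAPI choice, model name, and port are mine for illustration.

    # Minimal single-process model server: no k8s, no operator, no Helm.
    # Assumes: pip install fastapi uvicorn transformers torch
    from fastapi import FastAPI
    from pydantic import BaseModel
    from transformers import pipeline

    app = FastAPI()
    # Illustrative model choice; swap in whatever you actually serve.
    classifier = pipeline("sentiment-analysis",
                          model="distilbert-base-uncased-finetuned-sst-2-english")

    class Query(BaseModel):
        text: str

    @app.post("/predict")
    def predict(q: Query):
        return classifier(q.text)[0]

    # Run with: uvicorn server:app --host 0.0.0.0 --port 8000

Dockerize that and docker run --gpus all works on any cloud VM with the NVIDIA container toolkit installed, no EKS/AKS required.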
Very off topic, but every time I see Nvidia expand toward AI products, I'm reminded that they had every opportunity to expand toward crypto products and didn't. I like that they work on what they believe in, and skip what they don't. In a time when AI is becoming a buzzword, this feels refreshing.
It would be nice if Nvidia did not impose artificial driver restrictions and legal kneecaps on consumer GeForce cards for cloud usage to prop up their enterprise ones... but shareholder interests come before anyone else's.
At this point it's more about the framework you use than about Nvidia. Anything dockerized runs on any compatible underlying hardware with no issues. Optimization, on the other hand, is fragmented: FasterTransformer or TensorRT conversion comes with half-baked layer support that lags by six months or more.

The NVAIE license is what Nvidia wants enterprises to pay to run their bespoke cards in shared-VRAM configurations, while kneecapping consumer cards that could do the same job better, with more CUDA cores but less memory.

And don't even get me started on the RIVA stack.

FP8 emulation is also never going to get backported; instead, only the H100 and 4090 can make use of it.
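If you haven't hit the conversion wall yourself, this is roughly where it bites: the ONNX parser simply rejects ops it doesn't support yet. A sketch, not a complete pipeline; the file paths are illustrative and it assumes the tensorrt Python package.

    # ONNX -> TensorRT conversion sketch: half-baked layer support
    # surfaces here as parser errors on unsupported ops.
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, logger)

    with open("model.onnx", "rb") as f:  # illustrative path
        if not parser.parse(f.read()):
            # This is where you learn a layer isn't supported yet.
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            raise SystemExit("conversion failed on unsupported layer(s)")

    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.FP16)  # FP8 needs Hopper/Ada, i.e. H100/4090
    with open("model.plan", "wb") as f:
        f.write(builder.build_serialized_network(network, config))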
The AI shovels industry is doing good business. Other than that, is there any major use case behind the recent AI hype? One that has brought tangible benefits, or at the very least a positive ROI?
We need local models for our confidential data. Nvidia, we can already train using OpenAI or a beefy hosted server.

But this particular data is air-gapped.
Cool!

Is the cost AWS-level wasteful, or something reasonable?

I can get an A4000 with 16GB of VRAM, which can run some models, for $140 per month.

I can't say my setup is anything special, but not having to do it yourself has some value.
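The back-of-envelope math, for what it's worth. The $140/month A4000 is my own number; the cloud rate below is an assumed figure for illustration, not a quoted price.

    # Rough $/GPU-hour comparison.
    # $140/month A4000 is from my setup; the cloud on-demand rate
    # is an ASSUMPTION for illustration, not a quoted AWS price.
    HOURS_PER_MONTH = 730  # average hours in a month

    a4000_per_hour = 140.0 / HOURS_PER_MONTH
    print(f"Hosted A4000: ${a4000_per_hour:.2f}/hr")  # ~$0.19/hr

    cloud_rate = 1.00  # assumed on-demand single-GPU $/hr
    print(f"Cloud on-demand: ${cloud_rate:.2f}/hr "
          f"(${cloud_rate * HOURS_PER_MONTH:.0f}/month if left running)")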