Are GPUs Worth It for ML?

131 points by varunkmohan over 2 years ago

17 comments

PeterisP over 2 years ago
For some reason they focus on the inference, which is the computationally cheap part. If you're working on ML (as opposed to deploying someone else's ML) then almost all of your workload is training, not inference.
scosman over 2 years ago
We did a big analysis of this a few years back. We ended up using a big spot-instance cluster of CPU machines for our inference cluster. Much more consistently available than spot GPU, at greater scale, and at a better price per inference (at least at the time). Scaled well to many billions of inferences. Of course, compare cost per inference on your own models to make sure the logic applies. Article on how it worked: https://www.freecodecamp.org/news/ml-armada-running-tens-of-billions-of-ml-predictions-on-a-budget-f9505c820203/

Training was always GPUs (for speed), non-spot-instance (for reliability), and cloud-based (for infinite parallelism). Training work tended to be chunky; it never made sense to build servers in house that would sit idle some of the time and be queued at other times.
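The "compare cost per inference on your own models" advice boils down to simple arithmetic. A minimal sketch of that comparison follows; all instance prices and throughput figures in it are made-up placeholders, not numbers from the comment or the article:

```python
# Back-of-the-envelope cost-per-inference comparison.
# Every number below is a placeholder -- substitute your own measured
# throughput and your cloud provider's current instance prices.

def cost_per_million(hourly_price_usd: float, inferences_per_sec: float) -> float:
    """USD to run one million inferences on a single instance."""
    inferences_per_hour = inferences_per_sec * 3600
    return hourly_price_usd / inferences_per_hour * 1_000_000

# Hypothetical figures: a CPU spot instance vs. an on-demand GPU instance.
cpu_spot = cost_per_million(hourly_price_usd=0.10, inferences_per_sec=50)
gpu_on_demand = cost_per_million(hourly_price_usd=3.00, inferences_per_sec=2000)

print(f"CPU spot:      ${cpu_spot:.2f} per 1M inferences")
print(f"GPU on-demand: ${gpu_on_demand:.2f} per 1M inferences")
```

Which side wins depends entirely on the model and the prices you plug in, which is the comment's point.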
37ef_ced3 over 2 years ago
For small-scale transformer CPU inference you can use, e.g., Fabrice Bellard's https://bellard.org/libnc/

Similarly, for small-scale convolutional CPU inference, where you only need to do maybe 20 ResNet-50 (batch size 1) per second per CPU (cloud CPUs cost $0.015 per hour), you can use inference engines designed for this purpose, e.g., https://NN-512.com

You can expect about 2x the performance of TensorFlow or PyTorch.
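To make the economics quoted above concrete, a quick worked calculation with the comment's own figures (20 ResNet-50 inferences per second on a $0.015/hour cloud CPU):

```python
# Cost per inference implied by the figures in the comment above.
inferences_per_hour = 20 * 3600              # 72,000 inferences per hour
cost_per_inference = 0.015 / inferences_per_hour

print(f"${cost_per_inference:.2e} per inference")            # ~$2.1e-07
print(f"${cost_per_inference * 1_000_000:.2f} per million")  # ~$0.21
```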
Kukumber over 2 years ago
An interesting question; it shows how insanely overpriced GPUs still are, especially in the cloud environment.
synergy20 over 2 years ago
I think the TPU is the way to go for ML, be it training or inference.

We're using GPUs (some contain a TPU-style block inside) due to 'historical reasons'. With a vector unit (x86 AVX, ARM SVE, RISC-V RVV) already part of the host CPU, putting a TPU on a separate die of a chiplet, or just on a PCIe card, will handle the heavy-lift ML work fine. It should be much cheaper than the GPU model for ML nowadays, unless you are both a PC game player and an ML engineer.
jacquesm over 2 years ago
This is an ad.
mpaepper over 2 years ago
This also very much depends on the inference use case / context. For example, I work in deep learning on digital pathology, where images can be up to 100,000 x 100,000 pixels in size and inference needs GPUs, as it's just way too slow otherwise.
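For images that large, the usual pattern is to stream tiles through the GPU in batches rather than load the whole slide. A minimal sketch follows; the `read_tile` helper (e.g. something backed by a slide reader like OpenSlide) and the trained `model` are assumed placeholders, not anything described in the comment:

```python
# Minimal sketch: tiled, batched GPU inference over a very large image.
import torch

TILE = 512    # tile edge length in pixels
BATCH = 32    # tiles per forward pass

def infer_slide(read_tile, model, width, height, device="cuda"):
    """read_tile(x, y, size) is assumed to return an HxWx3 uint8 numpy array."""
    model = model.to(device).eval()
    batch, coords, results = [], [], []
    with torch.no_grad():
        for y in range(0, height, TILE):
            for x in range(0, width, TILE):
                tile = read_tile(x, y, TILE)
                t = torch.from_numpy(tile).permute(2, 0, 1).float() / 255.0
                batch.append(t)
                coords.append((x, y))
                if len(batch) == BATCH:
                    out = model(torch.stack(batch).to(device))
                    results.extend(zip(coords, out.cpu()))
                    batch, coords = [], []
        if batch:  # flush the final partial batch
            out = model(torch.stack(batch).to(device))
            results.extend(zip(coords, out.cpu()))
    return results
```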
rfrey over 2 years ago
Not related to the article, but how would one begin to become smart about optimizing GPU workloads? I've been charged with deploying an application that is a mixture of heuristic search and inference, and that has been exclusively single-user to this point.

I'm sure every little thing I've discovered (e.g. measuring CPU/GPU workloads, trying to multiplex access to the GPU, etc.) was probably covered in somebody's grad school notes 12 years ago, but I haven't found a source of info on the topic.
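One common starting point for the "measuring CPU/GPU workloads" part is to sample GPU utilization while the workload runs and see whether the GPU is actually the bottleneck. A small sketch using the nvidia-ml-py (pynvml) bindings, which expose the same counters nvidia-smi reports:

```python
# Poll GPU utilization and memory once per second (Ctrl-C to stop).
import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU

try:
    while True:
        util = pynvml.nvmlDeviceGetUtilizationRates(handle)
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        print(f"gpu={util.gpu:3d}%  mem_io={util.memory:3d}%  "
              f"vram={mem.used / 2**30:.1f}/{mem.total / 2**30:.1f} GiB")
        time.sleep(1)
except KeyboardInterrupt:
    pynvml.nvmlShutdown()
```

If utilization stays low while the CPU side is busy, that usually points toward batching requests or multiplexing several processes onto the GPU rather than buying more GPU.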
fancyfredbot over 2 years ago
There are some pretty elegant solutions out there for the problem of having the right ratio of CPU to GPU. One of the nicer ones is rCUDA: https://scholar.google.com/citations?view_op=view_citation&hl=es&user=4XgrRlMAAAAJ&citation_for_view=4XgrRlMAAAAJ:zYLM7Y9cAGgC
einpoklum over 2 years ago
> And CPUs are so much cheaper

Doesn't look like it. Consumer:

AMD ThreadRipper 3970X: ~3000 USD on Newegg
https://www.newegg.com/amd-ryzen-threadripper-2990wx/p/N82E16819113618?Description=AMD%20Ryzen&cm_re=AMD_Ryzen-_-19-113-618-_-Product&quicklink=true

NVIDIA RTX 3080 Ti Founders' Edition: ~2000 USD
https://www.newegg.com/nvidia-900-1g133-2518-000/p/1FT-0004-006T6?Description=Geforce%20RTX%203080%20Ti%20Founders%20edition&cm_re=Geforce_RTX%203080%20Ti%20Founders%20edition-_-1FT-0004-006T6-_-Product&quicklink=true

For servers, a comparison is even more complicated and it wouldn't be fair to just give two numbers, but I still don't think GPUs are more expensive.

... besides, none of that may matter if yours is a power budget.
ummonk over 2 years ago
What a clickbaity article. It's an interesting discussion of GPU multiplexing for ML inference merged together with a sales pitch, but the clickbait title made me hate the article's bait and switch. This wasn't even an example of Betteridge's law, just a completely misleading headline.
PeterStuer over 2 years ago
"It feels wasteful to have an expensive GPU sitting idle while we are executing the CPU portions of the ML workflow"

What is expensive? Those 3090 Ti's are looking very tempting at current prices.
andrewmutz over 2 years ago
At training time they sure are. The only thing more expensive than fancy GPUs is the ML engineers whose productivity they are improving.
jimmygrapes over 2 years ago
Perhaps it's been mentioned before, but I do find it curious how often crypto mining was lambasted for contributing to climate change, yet I haven't seen anybody bat an eye at a fairly similar amount of compute power used for ML applications. Makes me wonder.
triknomeister over 2 years ago
I thought this post would be about how ASICs are probably a better bet.
sabotista over 2 years ago
It depends a lot on your problem, of course.

Game-playing (e.g. AlphaGo) is computationally hard, but the rules are immutable, target functions (e.g., heuristics) don't change much, and you can generate arbitrarily sized clean data sets (play more games). On these problems, ML-scaling approaches work very well. For business problems where the value of data decays rapidly, though, you probably don't need the power of a deep or complex neural net with millions of parameters, and expensive specialty hardware probably isn't worth it.
rvz over 2 years ago
Not only can the end results of these deep learning models be tricked by a single pixel or confused by malicious input and rendered useless; deep learning training, retraining, and fine-tuning on GPUs and TPUs, all running in data centers, contribute significantly to burning up the planet and driving up costs, and the models are used for nothing but surveillance of our own data.

If a model doesn't work, it has to be retrained on new data again, and there are no efficient alternatives to this energy waste other than using more GPUs, TPUs, etc., emitting more CO2, even after years of deep learning's existence.

A complete waste of resources and energy. Therefore it is not worth it at all.