I was just saying to a colleague the day before this announcement that the inevitable consequence of the popularity of <i>large</i> language models will be GPUs with more memory.<p>Previously, GPUs were designed for gamers, and no game really "needs" more than 16 GB of VRAM. I've seen reviews of the A100 and H100 cards saying that 80 GB is ample for even the most demanding usage.<p>Now? Suddenly GPUs with 1 TB of memory could be <i>immediately</i> used, at scale, by deep-pocketed customers happy to throw their entire wallets at NVIDIA.<p>This new H100 NVL model is a Frankenstein's monster stitched together from whatever they had lying around. It's a desperate move to corner the market as early as possible. It's just the beginning, a preview of the times to come.<p>There will be a new digital moat, a new capitalist's empire, built upon the scarcity of cards "big enough" to run models that nobody but a handful of megacorps can afford to train.<p>In fact, it won't be enough to restrict access by making the models expensive to train. The real moat will be models too expensive to run. Users will have to sign up, get API keys, and stand in line.<p>"Safe use of AI" my ass. Safe profits, more like. Safe monopolies, safe from competition.