
The impact of competition and DeepSeek on Nvidia

655 points by eigenvalue, 4 months ago

68 comments

dang (3 months ago)
Related ongoing thread:

Nvidia's $589B DeepSeek rout - https://news.ycombinator.com/item?id=42839650 - Jan 2025 (574 comments)
pjdesno (3 months ago)
The description of DeepSeek reminds me of my experience in networking in the late 80s to early 90s.

Back then a really big motivator for Asynchronous Transfer Mode (ATM) and fiber-to-the-home was the promise of video on demand, which was a huge market in comparison to the Internet of the day. Just about all the work in this area ignored the potential of advanced video coding algorithms, and assumed that broadcast TV-quality video would require about 50x more bandwidth than today's SD Netflix videos, and 6x more than 4K.

What made video on the Internet possible wasn't a faster Internet, although the 10-20x increase every decade certainly helped - it was smarter algorithms that used orders of magnitude less bandwidth. In the case of AI, GPUs keep getting faster, but it's going to take a hell of a long time to achieve a 10x improvement in performance per cm^2 of silicon. Vastly improved training/inference algorithms may or may not be possible (DeepSeek seems to indicate the answer is "may"), but there's no physical limit preventing them from being discovered, and the disruption when someone invents a new algorithm can be nearly immediate.
breadwinner (3 months ago)
Great article, but it seems to have a fatal flaw.

As pointed out in the article, Nvidia has several advantages, including:

- Better Linux drivers than AMD
- CUDA
- PyTorch is optimized for Nvidia
- High-speed interconnect

Each of these advantages is under attack:

- George Hotz is making better drivers for AMD
- MLX, Triton, JAX: higher-level abstractions that compile down to CUDA
- Cerebras and Groq solve the interconnect problem

The article concludes that NVIDIA faces an unprecedented convergence of competitive threats. The flaw in the analysis is that these threats are not unified. Any serious competitor must address ALL of Nvidia's advantages. Instead, Nvidia is being attacked by multiple disconnected competitors, and each of those competitors is only attacking one Nvidia advantage at a time. Even if each of those attacks is individually successful, Nvidia will remain the only company that has ALL of the advantages.
fairity (3 months ago)
DeepSeek just further reinforces the idea that there is a first-mover disadvantage in developing AI models.

When someone can replicate your model for 5% of the cost in 2 years, I can only see 2 rational decisions:

1) Start focusing on cost efficiency today to reduce the advantage of the second mover (i.e. trade growth for profitability)

2) Figure out how to build a real competitive moat through one or more of the following: economies of scale, network effects, regulatory capture

On the second point, it seems to me like the only realistic strategy for companies like OpenAI is to turn themselves into a platform that benefits from direct network effects. Whether that's actually feasible is another question.
UncleOxidant (3 months ago)
Even if DeepSeek has figured out how to do more (or at least as much) with less, doesn't the Jevons Paradox come into play? GPU sales would actually increase, because even smaller companies would get the idea that they can compete in a space that only 6 months ago we assumed would be the realm of the large mega tech companies (the Metas, Googles, OpenAIs), since the small players couldn't afford to compete. Now that story is in question, since DeepSeek only has ~200 employees and claims to be able to train a competitive model for about 20x less than the big boys spend.
colinnordin (3 months ago)
Great article.

> Now, you still want to train the best model you can by cleverly leveraging as much compute as you can and as many trillion tokens of high quality training data as possible, but that's just the beginning of the story in this new world; now, you could easily use incredibly huge amounts of compute just to do inference from these models at a very high level of confidence or when trying to solve extremely tough problems that require "genius level" reasoning to avoid all the potential pitfalls that would lead a regular LLM astray.

I think this is the most interesting part. We always knew a huge fraction of the compute would be on inference rather than training, but it feels like the newest developments are pushing this even further towards inference.

Combine that with the fact that you can run the full R1 (680B) distributed across 3 consumer computers [1].

If most of NVIDIA's moat is in being able to efficiently interconnect thousands of GPUs, what happens when that is only important to a small fraction of the overall AI compute?

[1]: https://x.com/awnihannun/status/1883276535643455790
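A rough back-of-envelope sketch of why a ~680B-parameter model can be split across roughly three high-memory consumer machines. The 4-bit quantization and ~192GB of usable memory per machine are assumptions for illustration, not figures from the linked post:

```python
# Approximate memory math only; quantization level and per-machine memory are assumptions.
total_params = 680e9                 # full R1-scale MoE parameter count
bytes_per_param = 0.5                # 4-bit quantized weights
weights_gb = total_params * bytes_per_param / 1e9    # ~340 GB of weights

per_machine_gb = 192                 # e.g. a maxed-out Apple Silicon machine
machines_needed = -(-weights_gb // per_machine_gb)   # ceiling division

print(f"~{weights_gb:.0f} GB of weights -> {int(machines_needed)} machines "
      "(before KV cache and runtime overhead, which pushes it toward 3)")
```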
simonw (3 months ago)
This is excellent writing.

Even if you have no interest at all in stock market shorting strategies, there is plenty of meaty technical content in here, including some of the clearest summaries I've seen anywhere of the interesting ideas from the DeepSeek v3 and R1 papers.
andrewgross (3 months ago)
> The beauty of the MOE model approach is that you can decompose the big model into a collection of smaller models that each know different, non-overlapping (at least fully) pieces of knowledge.

I was under the impression that this was not how MoE models work. They are not a collection of independent models, but instead a way of routing to a subset of active parameters at each layer. There is no "expert" that is loaded or unloaded per question. All of the weights are loaded in VRAM; it's just a matter of which are actually loaded to the registers for calculation. As far as I could tell from the DeepSeek v3/v2 papers, their MoE approach follows this instead of being an explicit collection of experts. If that's the case, there's no VRAM saving to be had using an MoE, nor an ability to extract the weights of an expert to run locally (aside from distillation or similar).

If there is someone more versed in the construction of MoE architectures, I would love some help understanding what I missed here.
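For context, a minimal sketch of the usual top-k routed MoE layer (illustrative PyTorch, not DeepSeek's actual implementation; layer sizes and expert counts are made up) shows the point the comment makes: routing selects which experts compute for each token, but every expert's weights stay resident in memory.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoELayer(nn.Module):
    """Per-token top-k expert routing; all experts stay loaded, only k run per token."""
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)   # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x):                              # x: (num_tokens, d_model)
        gate_scores = self.router(x)                   # (num_tokens, n_experts)
        topk_scores, topk_idx = gate_scores.topk(self.k, dim=-1)
        topk_weights = F.softmax(topk_scores, dim=-1)  # normalize over the chosen k
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):      # every expert's weights are resident
            token_rows, slot = (topk_idx == e).nonzero(as_tuple=True)
            if token_rows.numel() == 0:
                continue                               # no tokens routed to this expert
            out[token_rows] += topk_weights[token_rows, slot, None] * expert(x[token_rows])
        return out

# Only k / n_experts of the FFN FLOPs run per token, but the full parameter set
# must be kept in memory, so an MoE saves compute, not (by itself) VRAM.
```

DeepSeek's published variant layers shared experts and load-balancing tweaks on top of this, but the resident-weights point is the same as the comment describes.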
j7ake (3 months ago)
This was an amazing summary of the current ML landscape.

I think the title does the article an injustice, or maybe it's too long for people to read far enough to appreciate it (e.g. the DeepSeek material could be an article in itself).

In any case, those with longer attention spans will benefit from this read.

Thanks for writing it up!
lxgr (3 months ago)
Man, do I love myself a deep, well-researched long-form contrarian analysis published as a tangent of an already niche blog on a Sunday evening! The old web isn't dead yet :)
liuliu (3 months ago)
This is a humble and informed article (compared to others written by financial analysts over the past few days). But it still has the flaw of over-estimating the efficiency of deploying a 687B MoE model on commodity hardware (for local use; cloud providers will do efficient batching, which is different): you cannot do that on any single Apple machine (you need to hook up at least 2 M2 Ultras). You can barely deploy it on desktop computers, because non-registered DDR5 tops out at 64GiB per stick (so you are safe with 512GiB of RAM). Now coming to PCIe bandwidth: 37B parameters activated per token means exactly that; each token's activation requires a new set of 37B weights, so you need to transfer ~18GiB per token into VRAM (assuming a 4-bit quant). PCIe 5 (5090) has 64GB/s of transfer bandwidth, so your upper bound is limited to about 4 tok/s with a well-balanced, purpose-built PC (and custom software). For programming tasks that usually require ~3000 tokens of thinking, we are looking at ~12 minutes per interaction.
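The arithmetic in this comment roughly checks out; here is a quick sketch of it, assuming 4-bit (0.5 byte/parameter) weights and that every token must stream its full 37B active parameters over a single PCIe 5.0 x16 link:

```python
# Back-of-envelope only; quantization, bus utilization, and token counts are assumptions.
active_params = 37e9           # active parameters per token in the MoE
bytes_per_param = 0.5          # 4-bit quantization
pcie5_x16_bw = 64e9            # bytes/s, theoretical PCIe 5.0 x16 bandwidth

bytes_per_token = active_params * bytes_per_param         # ~18.5e9 bytes (~17 GiB)
seconds_per_token = bytes_per_token / pcie5_x16_bw         # ~0.29 s
tokens_per_second = 1.0 / seconds_per_token                # ~3.5 tok/s upper bound

thinking_tokens = 3000
minutes_per_interaction = thinking_tokens * seconds_per_token / 60
print(f"{tokens_per_second:.1f} tok/s, ~{minutes_per_interaction:.0f} min for {thinking_tokens} tokens")
# -> roughly 3-4 tok/s and 12-15 minutes, in line with the estimate above
```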
hn_throwaway_99 (3 months ago)
I'm curious if someone more informed than me can comment on this part:

> Besides things like the rise of humanoid robots, which I suspect is going to take most people by surprise when they are rapidly able to perform a huge number of tasks that currently require an unskilled (or even skilled) human worker (e.g., doing laundry ...

I've always said that the real test for humanoid AI is folding laundry, because it's an incredibly difficult problem. And I'm not talking about giving a machine clothing piece-by-piece, flattened so it just has to fold; I'm talking about saying to a robot "There's a dryer full of clothes. Go fold it into separate piles (e.g. underwear, tops, bottoms) and don't mix the husband's clothes with the wife's." That is, something most humans in the developed world have to do a couple times a week.

I've been following some of the big advances in humanoid robot AI, but the above task still seems miles away given current tech. So is the author's quote just more unsubstantiated hype that I'm constantly bombarded with in the AI space, or have there been advancements recently in robot AI that I'm unaware of?
brandonpelfrey (3 months ago)
Great article. I still feel like very few people are viewing the DeepSeek effects in the right light. If we are 10x more efficient, it's not that we use 1/10th the resources we did before; we expand to have 10x the usage we did before. All technology products have moved in this direction. Where there is capacity, we will use it. This argument would not work if we were close to AGI or something and didn't need more, but I don't think we're actually close to that at all.
skizm (3 months ago)
I'm wondering if there's a (probably illegal) strategy in the making here:

- Wait till NVDA rebounds in price.
- Create an OpenAI "competitor" that is powered by Llama or a similar open-weights model.
- Obscure the fact that the company runs on this open tech and make it seem like you've developed your own models, but don't outright lie.
- Release an app and whitepaper (the whitepaper looks and sounds technical, but is incredibly light on details; you only need to fool some new-grad stock analysts).
- Pay some shady click farms to get your app to the top of Apple's charts (you only need it to be there for like 24 hours tops).
- Collect profits from your NVDA short positions.
snowmaker (3 months ago)
This is an excellent article, basically a patio11 / Matt Levine level breakdown of what's happening with the GPU market.
naiv (3 months ago)
I used to own several adult companies in the past. Incredibly high margins, and then along came Pornhub, and we could barely survive after it because we did not adapt.

With DeepSeek this is now the 'Pornhub of AI' moment. Adapt or die.
typeofhuman (3 months ago)
I'm rooting for DeepSeek (or any competitor) against OpenAI because I don't like Sam Altman. I'm confident in admitting it.
pavelstoev (3 months ago)
English economist William Stanley Jevons vs. the author of the article.

Will NVIDIA be in trouble because of DSR1? Interpreting the Jevons effect: if LLMs are "steam engines" and DSR1 brings a 90% efficiency improvement for the same performance, more of it will be deployed. And this is not even counting the increase due to <think> tokens.

More NVIDIA GPUs will be sold to support the growing use cases of more efficient LLMs.
chvid (3 months ago)
For sure NVIDIA is priced for perfection, perhaps more than any other company of similar market value.

I think two threats are the biggest.

First, Apple, TSMC's largest customer. They are already making their own GPUs for their data centers. If they were to sell these to others, they would be a major competitor.

You would have the same GPU stack on your own phone, laptop, PC, and data center. Already big developer mindshare. Also useful in a world where LLMs run (in part) on the end user's local machine (like Apple Intelligence).

Second is China: Huawei, DeepSeek, etc.

Yes, there will be no GPUs from Huawei in the US in this decade. And the Chinese won't win in one big massive battle. Rather, it is going to be death by a thousand cuts.

Just as happened with the Huawei Mate 60: it is only sold in China, yet today Apple is losing business big time in China.

In the same manner, OpenAI and Microsoft will have their business hurt by DeepSeek even if DeepSeek were completely banned in the West.

Likely we will see news on Chinese AI accelerators this year, and I wouldn't be surprised if we soon saw Chinese hyperscalers offering cheaper GPU cloud compute than the West due to a combination of cheaper energy, labor cost, and sheer scale.

Lastly, AMD is no threat to NVIDIA, as they are far behind and follow the same path with little in the way of differentiating themselves.
mgraczyk (3 months ago)
The beginning of the article was good, but the analysis of DeepSeek and what it means for Nvidia is confused and clearly out of the loop.

- People have been training models at below-FP32 precision for many years; I did this in 2021 and it was already easy in all the major libraries.
- GPU FLOPs are used for many things besides training the final released model.
- Demand for AI is capacity limited, so it's possible and likely that increasing AI per FLOP would not substantially reduce the price of GPUs.
breadwinner (3 months ago)
Part of the reason Musk, Zuckerberg, Ellison, Nadella and other CEOs are bragging about the number of GPUs they have (or plan to have) is to attract talent.

Perplexity's CEO says he tried to hire an AI researcher from Meta, and was told to "come back to me when you have 10,000 H100 GPUs".

See https://www.businessinsider.nl/ceo-says-he-tried-to-hire-an-ai-researcher-from-meta-and-was-told-to-come-back-to-me-when-you-have-10000-h100-gpus/
lxgr (3 months ago)
The most important part for me is:

> DeepSeek is a tiny Chinese company that reportedly has under 200 employees. The story goes that they started out as a quant trading hedge fund similar to TwoSigma or RenTec, but after Xi Jinping cracked down on that space, they used their math and engineering chops to pivot into AI research.

I guess now we have the answer to the question that countless people have already asked: Where could we be if we figured out how to get most math and physics PhDs to work on things other than picking up pennies in front of steamrollers (a.k.a. HFT) again?
eigenvalue (3 months ago)
Sorry, my blog crashed! Had a stupid bug where it was calling GitHub too frequently to pull in updated markdown for the posts and kept getting rate limited. Had to rewrite it, but it should be much better now.
indymike (3 months ago)
This story could be applied to every tech breakthrough. We start where the breakthrough is moated by hardware, access to knowledge, and IP. Over time:

- Competition gets crucial features into cheaper hardware
- Work-arounds for most IP are discovered
- Knowledge finds a way out of the castle

This leads to a "Cambrian explosion" of new devices and software that usually gives rise to some game-changing new ways to use the new technology. I'm not sure why we all thought this somehow wouldn't apply to AI. We've seen the pattern with almost every new technology you can think of. It's just how it works. Only the time it takes for patents to expire changes this... so long as everyone respects the patent.
arcanus (3 months ago)
> Amazon gets a lot of flak for totally bungling their internal AI model development, squandering massive amounts of internal compute resources on models that ultimately are not competitive, but the custom silicon is another matter

Juicy. Anyone have a link or context for this? I'd not heard of this reception to Nova and related efforts.
wtcactus (3 months ago)
To me, this seems like we are back in 1953 and a company has just announced it is now capable of building one of IBM's 5 computers for 10% of the price.

I really don't understand the rationale of "We can now train GPT-4o for 10% of the price, so that will bring demand for GPUs down." If I can train GPT-4o for 10% of the price, and I have a budget of 1B USD, that means I'm now going to use the same budget and train my model for 10x as long (or 10x bigger).

At the same time, a lot of small players that couldn't properly train a model before, because the starting point was simply out of their reach, will now be able to purchase equipment that's capable of something of note, and they will buy even more GPUs.

P.S. Yes, I know that the original quote, "I think there is a world market for maybe five computers", was taken out of context.

P.P.S. In this rationale, I'm also operating under the assumption that DeepSeek's numbers are real. Which, given the track record of Chinese companies, is probably not true.
jwan584 (3 months ago)
The point about using FP32 for training is wrong. Mixed precision (FP16 multiplies, FP32 accumulates) has been in use for years – the original paper came out in 2017.
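For readers unfamiliar with the pattern being referenced, here is a minimal sketch of standard mixed-precision training in PyTorch: half-precision matmuls with FP32 master weights and loss scaling. The model, data, and hyperparameters below are placeholders, not anything from the article.

```python
# Illustrative only: a toy model and random data, showing the standard AMP recipe.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 10)).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)   # FP32 master weights/optimizer state
scaler = torch.cuda.amp.GradScaler()                         # loss scaling avoids FP16 underflow

for step in range(100):
    x = torch.randn(32, 1024, device="cuda")
    y = torch.randint(0, 10, (32,), device="cuda")
    optimizer.zero_grad(set_to_none=True)
    with torch.cuda.amp.autocast(dtype=torch.float16):       # matmuls run in FP16
        loss = nn.functional.cross_entropy(model(x), y)      # reductions stay in FP32
    scaler.scale(loss).backward()
    scaler.step(optimizer)                                    # unscales grads, then FP32 update
    scaler.update()
```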
dartos (3 months ago)
This just in:

Competition lowers the value of monopolies.
gnlrtntv (3 months ago)
> While Apple's focus seems somewhat orthogonal to these other players in terms of its mobile-first, consumer oriented, "edge compute" focus, if it ends up spending enough money on its new contract with OpenAI to provide AI services to iPhone users, you have to imagine that they have teams looking into making their own custom silicon for inference/training

This is already happening today. Most of the new LLM features announced this year are primarily on-device, using the Neural Engine, and the rest run in Private Cloud Compute, which also uses Apple-trained models on Apple hardware.

The only features using OpenAI for inference are the ones that announce the content came from ChatGPT.
uncletaco (3 months ago)
When he says better Linux drivers than AMD, he's strictly talking about AI, right? Because for video the opposite has been the case for as far back as I can remember.
suraci (3 months ago)
DeepSeek is not the black swan.

NVDA was already overpriced a lot even without R1; the market is full of "air GPUs" hiding in the capex of tech giants like MSFT.

If orders are canceled or delivery fails for any reason, NVDA's EPS will be pulled back to its fundamentally justified level.

Or what if all those air GPUs are produced and delivered in the coming years, and demand keeps rising? Well, that will be a crazy world then.

It's a finance game, not related to the real world.
plaidfuji (3 months ago)
This is such a great read. The only missing facet of discussion here is that there is a valuation level of NVDA that would tip the balance toward military action by China against Taiwan. TSMC can only drive so much global value before the incentive to invade becomes irresistible. Unclear where that threshold is; if we're being honest, it could be any day.
mackid (3 months ago)
Microsoft did a bunch of research into low-bit weights for models. I guess OAI didn't look at this work.

https://proceedings.neurips.cc/paper/2020/file/747e32ab0fea7fbd2ad9ec03daa3f840-Paper.pdf
highfrequency (3 months ago)
The R1 paper (https://arxiv.org/pdf/2501.12948) emphasizes their success with reinforcement learning without requiring any supervised data (unlike RLHF, for example). They note that this works well for math and programming questions with verifiable answers.

What's totally unclear is what data they used for this reinforcement learning step. How many math problems of the right difficulty, with well-defined labeled answers, are available on the internet? (I see about 1,000 historical AIME questions, maybe another factor of 10 from other similar contests.) Similarly, they mention LeetCode; it looks like there are around 3,000 LeetCode questions online. Curious what others think - maybe the reinforcement learning step requires far less data than I would guess?
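The "verifiable answers" idea the comment refers to can be made concrete with a toy sketch: for problems with a known final answer, the reward is just a programmatic check on the model's output, with no reward model or human labels. The \boxed{} convention and the function names below are illustrative assumptions, not details taken from the R1 paper.

```python
# Toy sketch of a rule-based, verifiable reward for math problems.
import re

def extract_final_answer(completion: str):
    """Pull the last \\boxed{...} expression out of a model completion, if any."""
    matches = re.findall(r"\\boxed\{([^}]*)\}", completion)
    return matches[-1].strip() if matches else None

def verifiable_reward(completion: str, ground_truth: str) -> float:
    """1.0 if the extracted answer exactly matches the known answer, else 0.0."""
    answer = extract_final_answer(completion)
    return 1.0 if answer is not None and answer == ground_truth.strip() else 0.0

# Example: only the exact final answer earns reward; the chain of thought itself is unscored.
print(verifiable_reward(r"... so the total is \boxed{204}", "204"))   # 1.0
print(verifiable_reward(r"... I think it's \boxed{210}", "204"))      # 0.0
```

Each reward sample needs a problem with a machine-checkable answer, which is why the size of the pool of suitable problems (AIME-style contests, LeetCode-style judges) matters as much as the algorithm itself.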
manojlds (3 months ago)
> With the advent of the revolutionary Chain-of-Thought ("COT") models introduced in the past year, most noticeably in OpenAI's flagship O1 model (but very recently in DeepSeek's new R1 model, which we will talk about later in much more detail), all that changed. Instead of the amount of inference compute being directly proportional to the length of the output text generated by the model (scaling up for larger context windows, model size, etc.), these new COT models also generate intermediate "logic tokens"; think of this as a sort of scratchpad or "internal monologue" of the model while it's trying to solve your problem or complete its assigned task.

Is this right? I thought CoT was a prompting method; are we now calling the reasoning models "CoT models"?
samiv (3 months ago)
I think the biggest threat to NVIDIA's future right now is their own current success.

Their software platforms and CUDA are a very strong moat against everyone else. I don't see anyone beating them on that front right now.

The problem is that I'm afraid all the money sloshing around inside the company is rotting the culture, and that will compromise future development:

- Grifters are filling positions in many orgs, only trying to milk it as much as possible.
- Old employees become complacent with their nice RSU packages and rest-and-vest.

NVIDIA used to be extremely nimble and was fighting way above its weight class. Prior to the Mellanox acquisition it had only around 10k employees, and after, another 10k more.

If there's a real threat to their position at the top of the AI offerings, will they be able to roll up their sleeves and get back to work, or will the organization be unable to move ahead?

Long term, I think it's inevitable that China will take over the technology leadership. They have the population, and they have the education programs and the skill to do this. At the same time, in the old Western democracies things are becoming stagnant, and I even dare to say that the younger generations are declining. In my native country the educational system has collapsed; over 20% of kids that finish elementary school cannot read or write. They can mouth-breathe and scroll TikTok, though, but just barely, since their attention span is about the same as a goldfish's.
tempeler (3 months ago)
First of all, I don't invest in Nvidia, and I don't like oligopolies. But it is too early to talk about Nvidia's future. People are just betting and wishing about Nvidia's future. No one knows what people will do in the future, or what they will think. It's just guessing and betting. Their real competitor is not DeepSeek. Did AMD or others release something new that competes with Nvidia's products? If Nvidia remains the market leader, that means they will set the price. That's what being an oligopoly is like; they don't need to compete on price with competitors.
ozten (3 months ago)
NVIDIA sells shovels to the gold rush. One miner (Liang Wenfeng), who has previously purchased at least 10,000 A100 shovels, has a "side project" where they figured out how to dig really well with a shovel and shared their secrets.

The gold rush, whether real or a bubble, is still there! NVIDIA will still sell every shovel they can manufacture, as soon as it is available in inventory.

Fortune 100 companies will still want the biggest toolshed to invent the next paradigm or to be the first to get to AGI.
christkv (3 months ago)
All this is good news for all of us. Bad news probably for Nvidia's margins long term, but who cares. If we can train and run inference in fewer cycles and watts, that is awesome.
mrinterweb (3 months ago)
The vast majority of Nvidia's current value is tied to their dominance in AI hardware. That value could be threatened if LLMs could be trained and/or run efficiently using a CPU or a quantum chip. I don't understand enough about the capabilities of quantum computing to know if running or training an LLM would be possible using a quantum chip, but if it becomes possible, NVDA stock is unlikely to fare well (unless they are making the new chip).
btbuildem (3 months ago)
I always appreciate reading a take from someone who's well versed in the domains they have opinions about.

I think longer-term we'll eat up any slack in efficiency by throwing more inference demands at it -- but the shift is tectonic. It's a cultural thing. People got acclimated to schlepping around morbidly obese node packages and stringing together enormous Python libraries, meanwhile the DeepSeek guys are out here carving bits and bytes into bare metal. Back to FP!
p0w3n3d (3 months ago)
> which require low-latency responses, such as content moderation, fraud detection, dynamic pricing, etc.

Is it even legal to give different prices to different customers?
mkalygin (3 months ago)
This is such a comprehensive analysis, thank you. For someone just starting to learn about the field, it’s a great way to understand what’s going on in the industry.
kimbler (3 months ago)
Nvidia seems to be one step ahead of this, and you can see their platform efforts pushing towards creating large volumes of compute that are easy to manage for whatever your compute requirements are, be that training, inference, or whatever comes next in whatever form. People may be tackling some of these areas in isolation, but you do not want to build datacenters where everything is ring-fenced per task or usage.
macawfish (3 months ago)
This is exactly where Project DIGITS comes in. Nvidia needs to pivot toward being a local inference platform if they want to survive the next shift.
coolThingsFirst (3 months ago)
As a bystander it's so refreshing to see this. Global tech competition is great for the market, and it gives hope that LLMs aren't locked behind billions in investment and that smaller players can compete well too.

Exciting times to be living in.
metadat (3 months ago)
> Another very smart thing they did is to use what is known as a Mixture-of-Experts (MOE) Transformer architecture, but with key innovations around load balancing. As you might know, the size or capacity of an AI model is often measured in terms of the number of parameters the model contains. A parameter is just a number that stores some attribute of the model; either the "weight" or importance a particular artificial neuron has relative to another one, or the importance of a particular token depending on its context (in the "attention mechanism").

Has a wide-scale model analysis been performed inspecting the parameters and their weights for all popular open / available models yet? The impact and effects of disclosed inbound data and tuning parameters on individual vector tokens would prove highly informative and clarifying.

Such analysis would undoubtedly help semi-literate AI folks level up and bridge any gaps.
11101010001100 (3 months ago)
I think this is just a(nother) canary for many other markets in the US vs. China game of monopoly. One weird effect of all this is that US tech may go on to be overvalued (i.e., disconnected from fundamentals) for quite some time.
nokun7 (3 months ago)
Very interesting, and it seems like there is more room for optimization of WASM using SIMD, boosting performance by a lot! It's cool to see how AI can now run even faster in web browsers.
greenie_beans (3 months ago)
Reading this gave me a great idea for https://bookhead.net. Thanks!!

Also, thank you for the incredibly informative article.
rashidae (3 months ago)
While Nvidia’s valuation may feel bloated due to AI hype, AMD might be the smarter play.
qwertox (3 months ago)
Considering the fact that current models were trained on top-notch books, those read and studied by the most brilliant engineers, the models are pretty dumb.

They are more like the thing which enabled computers to work with and digest text instead of just code. The fact that they can parrot pretty interesting relationships from the texts they've consumed kind of proves that they are capable of statistically "understanding" what we're trying to talk with them about, so it's a pretty good interface.

But going back to the really valuable content of the books they've been trained on, they just don't understand it. There's another kind of AI which needs to be created that can really learn the concepts taught in those books, instead of just the words and the value of the proximities between them.

Learning that other missing part will require hardware just as uniquely powerful and flexible as what Nvidia has to offer. Those companies now optimizing for inference and LLM training will be good at it and have their market share, but they need to ensure that their entire stack is as capable as Nvidia's stack if they also want to be part of future developments. I don't know if Tenstorrent or Groq are capable of doing this, but I doubt it.
jms55 (3 months ago)
Great article, thanks for writing it! Really great summary of the current state of the AI industry for someone like me who's outside of it (but tangential, given that I work with GPUs for graphics).

The one thing from the article that sticks out to me is that the author/people are assuming that DeepSeek needing 1/45th the amount of hardware means that the other 44/45ths that large tech companies have invested were wasteful.

Does software not scale to meet hardware? I don't see this as 44/45ths wasted hardware, but as a free increase in the amount of hardware people have. Software needing less hardware means you can run even _more_ software without spending more money, not that you need less hardware, right? (For the top-end, non-embedded use cases.)

---

As an aside, the state of the "AI" industry really freaks me out sometimes. Ignoring any sort of short or long term effects on society, jobs, people, etc, just the sheer amount of money and time invested into this one thing is, insane?

Tons of custom processing chips, interconnects, compilers, algorithms, _press releases!_, etc, all for one specific field. It's like someone took the last decade of advances in computers, software, etc, and shoved it into the space of a year. For comparison, Rust 1.0 is 10 years old - I vividly remember the release. And even then it took years to propagate out as a "thing" that people were interested in and invested significant time into. Meanwhile DeepSeek releases a new model (complete with a customer-facing product name and chat interface, instead of something boring and technical), and in 5 days it's being replicated (to at least some degree) and copied by competitors. Google, Apple, Microsoft, etc are all making custom chips and investing insane amounts of money into different compilers, programming languages, hardware, and research.

It's just, kind of disquieting? Like everyone involved in AI lives in another world operating at breakneck speed, with billions of dollars involved, and the rest of us are just watching from the sidelines. Most of it (LLMs specifically) is no longer exciting to me. It's like, what's the point of spending time on a non-AI related project? We can spend some time writing a nice API and working on a cool feature or making a UI prettier, and that's great, and maybe with a good amount of contributors and solid, sustained effort, we can make a cool project that's useful and people enjoy, and that earns money to support people if it's commercial. But then for AI, GitHub repos with shiny well-written READMEs pop up overnight, tons of text gets written, thought, effort, and billions of dollars get burned or speculated on in an instant on new things, as soon as the next marketing release is posted.

How can the next advancement in graphics, databases, cryptography, etc compete with the sheer amount of societal attention AI receives?

Where does that leave writing software for the rest of us?
lenerdenator (3 months ago)
I think it's more than just the market effect on "established" AI players like Nvidia.

I don't think it's necessarily a coincidence that DeepSeek dropped within a short time frame of the announcement of the AI investment initiative by the Trump administration.

The idea is to get the money from investors who want to earn a return. Lower capex is attractive to investors, and DeepSeek drops capex dramatically. It makes Chinese AI talent look like the smart, safe bet. Nothing like DeepSeek could happen in China unless the powers-that-be knew about it and had some level of control. I'm also willing to bet that this isn't the best they've got.

They're saying "we can deliver the same capabilities for far less, and we're not going to threaten you with a tariff for not complying".
eprparadox (3 months ago)
link seems to be dead... is this article still up somewhere?
0n0n0m0uz (3 months ago)
Please tell me if I am wrong. I know very few of the details and have only heard a few headlines, and my hasty conclusion is that this development clearly shows the exponential nature of AI development in terms of how people are able to piggyback on the resources, time, and money of the previous iteration. They used the output from ChatGPT as the input to their model. Is this true, more or less accurate, or off base?
scudsworth (3 months ago)
what a compelling domain name. it compels me not to click on it
zippyman55 (4 months ago)
So at some point we will have too many cannon ball polishing factories and it will become apparent the cannon ball trajectory is not easily improved on.
naveen99 (3 months ago)
Deepseek iOS app makes TikTok ban pointless.
homarp (3 months ago)
See also https://news.ycombinator.com/item?id=42839650
robomartin (3 months ago)
Despite the fact that this article is very well written and certainly contains high quality information, I choose to remain skeptical as it pertains to Nvidia's position in the market. I'll come right out and say that my experience likely makes me see this from a biased position.

The premise is simple: Business is warfare. Anything you can do to damage or slow down the market leader gives you more time to catch up. FUD is a powerful force.

My bias comes from having been the subject of such attacks in my prior tech startup. Our technology was destroying the offerings of the market-leading multi-billion-dollar global company that pretty much owned the sector. The natural processes of such a beast caused them not to be able to design their way out of a paper bag. We clearly had an advantage. The problem was that we did not have the deep pockets necessary to flood the market with it and take them out.

What did they do?

They started a FUD campaign.

They went to every single large customer and our resellers (this was a hardware/software product) a month or two before the two main industry tradeshows, and lied to them. They promised that they would show market-leading technology "in just a couple of months" and would add comments like "you might want to put your orders on hold until you see this". We had multi-million dollar orders held for months in anticipation of these product unveilings.

And, sure enough, they would announce the new products with a great marketing push at the next tradeshow. All demos were engineered and manipulated to deceive, all of them. Yet the incredible power of throwing millions of dollars at this effort delivered what they needed: FUD.

The problem with new products is that it takes months for them to be properly validated. So, if the company that had frozen a $5MM order for our products decided to verify the claims of our competitor, it typically took around four months. In four months, they would discover that the new shiny object was shit and less stellar than what they were told. In other words, we won. Right?

No!

The mega-corp would then reassure them that they had iterated vast improvements into the design and those would be presented --I kid you not-- at the next tradeshow. Spending millions of dollars, they, at this point, denied us millions of dollars of revenue for approximately one year. FUD, again.

The next tradeshow came and went and the same cycle repeated... it would take months for customers to realize the emperor had no clothes. It was brutal to be on the receiving end of this without the financial horsepower to break through the FUD. It was a marketing arms race and we were unprepared to win it. In this context, the idea that a better mousetrap always wins is just laughable.

This did not end well. They were not going to survive another FUD cycle. Reality eventually comes into play. Except that, in this case, 2008 happened. The economic implosion caught us in serious financial peril due to the damage done by the FUD campaign. Ultimately, it was not survivable and I had to shut down the company.

It took this mega-corp another five years to finally deliver a product that approximated what we had, and another five years after that to match and exceed it. I don't even want to imagine how many hundreds of millions they spent on this.

So, long way of saying: China wants to win. No company in China is independent from government forces. This is, without a doubt, a war for supremacy in the AI world. It is my opinion that, while the technology, as described, seems to make sense, it is highly likely that this is yet another form of a FUD campaign to gain time. If they can deny Nvidia (and others) the orders needed to maintain the current pace, they gain time to execute on a strategy that could give them the advantage.

Time will tell.
eigenvalue (4 months ago)
Yesterday I wrote up all my thoughts on whether NVDA stock is finally a decent short (or at least not a good thing to own at this point). I’m a huge bull when it comes to the power and potential of AI, but there are just too many forces arrayed against them to sustain supernormal profits.<p>Anyway, I hope people here find it interesting to read, and I welcome any debate or discussion about my arguments.
aurareturn (3 months ago)
> Perhaps most devastating is DeepSeek's recent efficiency breakthrough, achieving comparable model performance at approximately 1/45th the compute cost. This suggests the entire industry has been massively over-provisioning compute resources.

I wrote in another thread why DeepSeek should increase demand for chips, not lower it.

1. More efficient LLMs should lead to more usage, which means more AI chip demand. Jevons Paradox.

2. Even if DeepSeek is 45x more efficient (it is not), models will just become 45x+ bigger. They won't stay small.

3. To build a moat, OpenAI and American AI companies need to up their datacenter spending even more.

4. DeepSeek's breakthrough is in distilling models. You still need a ton of compute to train the foundational model to distill.

5. DeepSeek's conclusion in their paper says more compute is needed for the next breakthrough.

6. DeepSeek's model is trained on GPT-4o/Sonnet outputs. Again, this reaffirms the fact that in order to take the next step, you need to continue to train better models. Better models will generate better data for next-gen models.

I think DeepSeek hurts OpenAI/Anthropic/Google/Microsoft. I think DeepSeek helps TSMC/Nvidia.

> Combined with the emergence of more efficient inference architectures through chain-of-thought models, the aggregate demand for compute could be significantly lower than current projections assume.

This is misguided. Let's think logically about this.

More thinking = smarter models

Faster hardware = more thinking

More/newer Nvidia GPUs, better TSMC nodes = faster hardware

Therefore, you can conclude that Nvidia and TSMC demand should go up because of CoT models. In 2025, CoT models are clearly bottlenecked by not having enough compute.

> The economics here are compelling: when DeepSeek can match GPT-4 level performance while charging 95% less for API calls, it suggests either NVIDIA's customers are burning cash unnecessarily or margins must come down dramatically.

Or that in order to build a moat, OpenAI/Anthropic/Google and other labs need to double down on even more compute.
diesel4 (3 months ago)
Link isn't working. Is there another or a cached version?
lauriewired (3 months ago)
Does no one realize this is a thinly-veiled ad? The URL is bizarre
OutOfHere (3 months ago)
It seems like a pointless discussion since DeepSeek uses Nvidia GPUs after all.
Giorgi (3 months ago)
Looks like a huge astroturfing effort from the CCP. I am seeing this coordinated propaganda inside every AI-related sub on Reddit, on social media, and now here.
miraculixx (3 months ago)
If we are to get to AGI, why do we need to train on all data? That's silly, and all we get is compression and probabilistic retrieval.

Intelligence, by definition, is not compression, but the ability to think and act according to new data, based on experience.

Truly AGI models will work on this principle, not on the best possible compression of as much data as possible.

We need a new approach.