Some time ago I interviewed with an AI governance think tank, and one of the topics that came up for ensuring Western-allied AI supremacy was trade security of compute resources. I didn’t want to shoot down their approach to their faces because they had a lot of blog posts on their site, with OpenAI co-authors, about the efficacy of this approach. But I was never sold on the idea that withholding compute resources would stop someone from researching better AI; I thought it would just push them to make more efficient AI. Lo and behold.
The article starts off on a pretty interesting premise: "I predicted there'd be no major advancements, but also, here are 4 major advancements in the last 4 weeks, many of which are predicated on another massive advancement that came out last year" (a SOTA model leveraging test-time compute)<p>We're also not seeing any price wars yet: every drop in price has been predicated on models getting faster, and while we don't know the size of the models behind the scenes, we can pretty reliably infer that they haven't been dropping their margins over time (at least, not at scale)<p>Anthropic even <i>raised</i> prices on their smallest model vs 2023 right as strong competition emerged.<p>Also seems strange to make a big deal about there being no moats... then imply that Meta deserves to be called before Congress for releasing Llama's weights.
The only losers in this AI race are probably humanity and overvalued stock markets. Kind of hard to price in years of future value and then see a relatively unknown Chinese AI lab catch the big boys with their pants down.<p>I really can't overemphasize the "humanity losing" point, though.
> Two years ago, they were on top of the world, having just introduced ChatGPT, and struck a big deal with Microsoft. Nobody else had a model close to GPT-4 level; media coverage of OpenAI was endless; customer adoption was swift. They could charge more or less what they wanted, with the whole world curious and no other provider. Sam Altman was almost universally adored. People imagined nearly infinite revenue and enormous profits.<p>Part of me likes to think of this as cosmic karma. I know it's been harped on a lot lately, but the irony is too rich, especially with how hostile they've been to the "Open" in "OpenAI."
Isn’t DeepSeek’s advantage that they didn’t actually start from scratch, and that embeddings & training were already supplied? To say that DeepSeek is comparable in performance to OpenAI is like saying the Kirkland brand is comparable to {insert non white labeled good here}; they’re created from the same inputs. To say that DeepSeek is a threat to AI supremacy is hyperbolic. As long as OpenAI innovates at the rate it’s been innovating, their value is undeniable. Sure, DeepSeek may tick after OpenAI’s tock, but the premium is in that tock.
Since we’re (as in the human race) going to go ahead and push for AGI with reckless abandon, the best thing that can happen is for it to be available to everyone, no moats. I can’t help but feel a bit of schadenfreude given that this was supposed to be the very principle that OpenAI was founded on and that they abandoned at the first whiff of money. Now, it looks like it’s going to happen whether they want it to or not.
If it only cost $5.5 million for DeepSeek to create R1, what’s stopping OpenAI from building on their open research to create something even better for $500 million?<p>Everyone keeps talking about diminishing returns in training and plateaus in reasoning ability, but it seems to me R1 demonstrates the opposite with its roughly 100x reduction in training costs: there is a <i>long</i> way to go and loads of low hanging fruit.
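A quick sanity check on that ratio (only the $5.5M R1 figure comes from the comment; the $500M frontier-run budget is an assumption for illustration, not a reported number):

```python
# Rough arithmetic behind the "100x reduction" framing.
deepseek_r1_cost = 5.5e6       # reported R1 training cost (USD)
assumed_frontier_cost = 500e6  # assumed frontier-scale training budget (USD)

ratio = assumed_frontier_cost / deepseek_r1_cost
print(f"~{ratio:.0f}x cheaper")  # prints "~91x cheaper"
```

Under that assumed budget the reduction works out to ~91x, which is in the ballpark of the "roughly 100x" being thrown around.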
Question: Suppose you wanted to train an LLM using say 100 GPUs, but for some reason you could not obtain the GPUs. You could however obtain CPUs. Each CPU lacks the massive parallelism of each GPU, so might have a throughput of, say, 1/100th the TFLOPS of the GPU.<p>Could you use 10000 CPUs and get a similar level of performance that 100 GPUs would have given?
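A back-of-envelope sketch of why raw FLOPS parity isn't the whole story (all numbers here are illustrative, using only the 1/100th ratio stated in the question):

```python
# Back-of-envelope: can 10,000 CPUs match 100 GPUs on paper?
# Illustrative numbers only, per the question's stated 1/100 ratio.

gpu_tflops = 100.0             # assumed per-GPU throughput (TFLOPS)
cpu_tflops = gpu_tflops / 100  # per-CPU throughput, per the question

gpu_cluster = 100 * gpu_tflops      # aggregate: 10,000 TFLOPS
cpu_cluster = 10_000 * cpu_tflops   # aggregate: also 10,000 TFLOPS

assert gpu_cluster == cpu_cluster   # raw compute parity on paper

# But training is largely communication- and memory-bound: gradients must
# be synchronized across all workers every step. With 100x more nodes, the
# all-reduce traffic and latency grow, and commodity CPUs lack the
# high-bandwidth memory and NVLink-class interconnects that GPU clusters
# rely on, so realized throughput would fall far short of the paper number.
```

So the likely answer is "not at reasonable cost": interconnect bandwidth and memory bandwidth, not peak FLOPS, tend to dominate at that scale.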
The US may ban the export of GPUs to China, but a GPU alone is not enough to build a training rig. You need memory and hundreds of smaller components. China has the ability to produce every component that goes into a computer. Can the US do the same?
There are still many more ideas to test and fuel to burn before declaring any outcome in the AI wars.<p>For more substantial content, check out:<p><i>The impact of competition and DeepSeek on Nvidia</i><p><a href="https://youtubetranscriptoptimizer.com/blog/05_the_short_case_for_nvda" rel="nofollow">https://youtubetranscriptoptimizer.com/blog/05_the_short_cas...</a><p><a href="https://news.ycombinator.com/item?id=42822162">https://news.ycombinator.com/item?id=42822162</a> (70 comments)
Come on now. That's a super clickbait title. The article actually says the race is in a dead heat and that LLM training for the same competency is getting cheaper, not that the race is "over".