科技回声 (Tech Echo)

A tech news platform built with Next.js, providing global tech news and discussion content.


Explainer: What's r1 and everything else?

262 points · by Philpax · 4 months ago

14 comments

WesleyJohnson · 4 months ago

Someone should author an ELI5 (or slightly older) guide to how LLMs, RL, Agents, CoT, etc. all work and what all these acronyms mean, and then add to it, daily or weekly, as new developments arise. I don't want to keep reading dozens of articles, white papers, tweets, etc. as new developments happen. I want to go back to the same knowledge base, authored by the same person (or people), that maintains a consistent reading and comprehension level and builds on prior points.

It seems like the AI space is moving impossibly fast, and it's just ridiculously hard to keep up unless 1) you work in this space, or 2) you are very comfortable with the technology behind it, so you can jump in at any point and understand it.

Comment #42837117 not loaded
Comment #42836683 not loaded
Havoc · 4 months ago

> people re-creating R1 (some claim for $30)

R1 or the R1 finetunes? Not the same thing...

HF is busy recreating R1 itself, but that seems to be a pretty big endeavour, not a $30 thing.

Comment #42830084 not loaded
Comment #42831530 not loaded
rahimnathwani · 4 months ago

> Most important, R1 shut down some very complex ideas (like DPO & MCTS) and showed that the path forward is simple, basic RL.

This isn't quite true. R1 used a mix of RL and supervised fine-tuning. The data used for supervised fine-tuning may have been model-generated, but the paper implies it was human-curated: they kept only the 'correct' answers.

Comment #42838860 not loaded
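The curation step described in the comment above (keep only the generations a checker marks correct, then fine-tune on the survivors) is often called rejection sampling. A minimal sketch of the idea, where `generate` and `is_correct` are hypothetical stand-ins for a real model call and a real verifier (e.g. a math grader or unit test), not any actual DeepSeek API:

```python
# Rejection-sampling data curation: sample several candidate answers per
# prompt, keep only the ones a verifier accepts, and use the survivors as
# supervised fine-tuning data.

def generate(prompt, n_samples):
    # Placeholder: a real system would sample n completions from the model.
    return [f"{prompt} -> candidate {i}" for i in range(n_samples)]

def is_correct(prompt, answer):
    # Placeholder verifier: accept even-numbered candidates for illustration.
    return answer.endswith(("0", "2", "4"))

def curate_sft_data(prompts, n_samples=6):
    dataset = []
    for prompt in prompts:
        for answer in generate(prompt, n_samples):
            if is_correct(prompt, answer):  # reject incorrect generations
                dataset.append({"prompt": prompt, "completion": answer})
    return dataset

data = curate_sft_data(["2+2=?", "3*3=?"])
```

The interesting part is entirely in `is_correct`: with a programmatic checker the "human curation" reduces to designing the verifier, which is why the line between RL and curated SFT is blurrier than the quoted claim suggests.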
fullstackchris · 4 months ago

So the conclusion is that AI is about to "increase in abilities at an exponential rate", with the only data point being that R1 successfully achieved o1 levels as an open-source model? In other words, two extremely unrelated themes?

Does this guy know people were writing verbatim the same thing in, like... 2021? It's still always incredible to me how the same repeated hype rises to the surface over and over. Oh well... old man gonna old man.

Comment #42838194 not loaded
raincole · 4 months ago

People keep saying that DeepSeek R1's training cost is just $5.6M. Where is the source?

I'm not asking for proof, just the source, even a self-claimed statement. I've read R1's paper and it doesn't give the $5.6M figure. Is it somewhere in DeepSeek's press release?
whimsicalism · 4 months ago

This is a pretty hype-laden/Twitter-laden article; I would not trust it to explain things to you.

Comment #42833938 not loaded
comrade1234 · 4 months ago

The benchmarks for the different models focus on math and coding accuracy. I have a use case for a model where those two functions are completely irrelevant; I'm only interested in writing (chat, stories, etc.). I guess you can't really benchmark 'concepts' as easily as logic.

With distillation, can a model be made that strips out most of the math and coding stuff?

Comment #42830159 not loaded
Comment #42828375 not loaded
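Distillation, as asked about above, trains a small student model to match a teacher's output distribution, so in principle you could distil on a writing-only prompt mix and let math/coding behaviour atrophy. A toy sketch of the standard temperature-softened KL objective, in pure Python with hypothetical logit vectors standing in for real model outputs:

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-softened softmax over a list of logits.
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) on softened distributions; the T^2 factor keeps
    # gradient magnitudes comparable across temperatures.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Identical logits give zero loss; diverging logits give a positive loss.
same = distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1])
diff = distillation_loss([2.0, 1.0, 0.1], [0.1, 1.0, 2.0])
```

The loss is only computed on whatever prompts you feed in, which is the lever the commenter is asking about: distil on chat and story prompts and nothing forces the student to reproduce the teacher's math or coding skills, though nothing guarantees they are cleanly "stripped out" either.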
1123581321 · 4 months ago

Nice explainer. R1 hit sensational mainstream news, which has resulted in some confusion and alarm among family and friends. It's hard to succinctly explain that this doesn't mean China is destroying us, that Americans immediately started working with the breakthrough, that cost optimization is inevitable in computing, etc.
richardatlarge · 4 months ago

T or F?

Nobody really saw the LLM leap coming.

Nobody really saw R1 coming.

We don't know what's coming.
bikamonki · 4 months ago

So, is AI already reasoning or not?

Comment #42831544 not loaded
Comment #42857541 not loaded
Comment #42838089 not loaded
Comment #42831374 not loaded
simonw · 4 months ago

From that article:

> ARC-AGI is a benchmark that's designed to be simple for humans but excruciatingly difficult for AI. In other words, when AI crushes this benchmark, it's able to do what humans do.

That's a misunderstanding of what ARC-AGI means. Here's what ARC-AGI creator François Chollet has to say: https://bsky.app/profile/fchollet.bsky.social/post/3les3izgdj22j

> I don't think people really appreciate how simple ARC-AGI-1 was, and what solving it really means.

> It was designed as the simplest, most basic assessment of fluid intelligence possible. Failure to pass signifies a near-total inability to adapt or problem-solve in unfamiliar situations.

> Passing it means your system exhibits non-zero fluid intelligence -- you're finally looking at something that isn't pure memorized skill. But it says rather little about how intelligent your system is, or how close to human intelligence it is.

Comment #42830841 not loaded
Comment #42830615 not loaded
Comment #42830675 not loaded
polotics · 4 months ago

Interesting article, but the flourish ending, "AI will soon (if not already) increase in abilities at an exponential rate," is not at all substantiated. It would be nice to know how the author gets to that conclusion.

Comment #42829777 not loaded
Comment #42829706 not loaded
Comment #42829665 not loaded
Comment #42830698 not loaded
Comment #42832224 not loaded
Comment #42833383 not loaded
Comment #42830634 not loaded
Comment #42832379 not loaded
huqedato · 4 months ago

I know that I'll get a lot of hate and downvotes for this comment.

I want to say that I have all the respect and admiration for these Chinese people, their ingenuity, and their way of doing innovation, even if they achieve this through technological theft and circumventing embargoes imposed by the US (we all know how GPUs find their way into their hands).

We are living in a time of multi-faceted war between the US, China, the EU, Russia, and others. One of the battlegrounds is AI supremacy. This war (as any war) isn't about ethics; it's about survival, and anything goes.

Finally, as someone from Europe, I confess that it is well known here that the "US innovates while the EU regulates", and that's a shame IMO. I have the impression that the EU is doing everything possible to keep us, European citizens, behind, as mere spectators in this tech war. We are already irrelevant, niche players.

Comment #42830445 not loaded
Comment #42857658 not loaded
Comment #42831056 not loaded
Comment #42831432 not loaded
Comment #42830259 not loaded
Comment #42831902 not loaded
justlikereddit · 4 months ago

Short version: it's hype.

Long version: it's marketing efforts stirring up hype around incremental software updates. If this were software being patched in 2005, we'd call it "ChatGPT v1.115".

> Patch notes:
> Added bells.
> Added whistles.