I can't help but feel that this talk was a lot of...fluff?<p>The synopsis, as far as my tired brain can remember:<p>- Here's a brief summary of the last 10 years<p>- We're reaching the limit of our scaling laws, because we've already trained on all the data we have available<p>- Some things that may be next are "agents", "synthetic data", and improving compute<p>- Some "ANNs are like biological NNs" rehash that would feel questionable if there <i>was</i> a thesis (which there wasn't? something about how body mass and brain mass are positively correlated?)<p>- 3 questions: the first was something about "hallucinations" and whether a model would be able to tell if it is hallucinating, then something that involved cryptocurrencies, and then a _slightly_ interesting question about multi-hop reasoning
> “Pre-training as we know it will unquestionably end,” Sutskever said onstage.<p>> “We’ve achieved peak data and there’ll be no more.”<p>> During his NeurIPS talk, Sutskever said that, while he believes existing data can still take AI development farther, the industry is tapping out on new data to train on. This dynamic will, he said, eventually force a shift away from the way models are trained today. He compared the situation to fossil fuels: just as oil is a finite resource, the internet contains a finite amount of human-generated content.<p>> “We’ve achieved peak data and there’ll be no more,” according to Sutskever. “We have to deal with the data that we have. There’s only one internet.”<p>What will replace Internet data for training? Curated synthetic datasets?<p>There are massive proprietary datasets out there which people avoid using for training due to copyright concerns. But if you actually own one of those datasets, that resolves a lot of the legal issues with training on it.<p>For example, Getty has a massive image library. Training on it would risk Getty suing you. But what if Getty decides to use it to train their own AI? Similarly, what if News Corp decides to train an AI using its publishing assets (Wall Street Journal, HarperCollins, etc)?
I’m glad Ilya starts the talk with a photo of Quoc Le, who was the lead author of a 2012 paper on scaling neural nets that inspired me to go into deep learning at the time.<p>His comments are relatively humble and based on public prior work, but it’s clear he’s working on big things today and also has a big imagination.<p>I’ll also just say that at this point “the cat is out of the bag”, and probably it will be a new generation of leaders — let us all hope they are as humanitarian — who drive the future of AI.
One thing he said I think was a profound understatement, and that's that "more reasoning is more unpredictable". I think we should be thinking about reasoning as in some sense <i>exactly the same thing as unpredictability</i>. Or, more specifically, <i>useful reasoning</i> is by definition unpredictable. This framing is important when it comes to, e.g., alignment.
I found this week's DeepMind podcast with Oriol Vinyals to be on similar topics as this talk (current situation of LLMs, path ahead with training) but much more interesting: <a href="https://pca.st/episode/0f68afd5-2b2b-4ce9-964f-38193b7e8dd3" rel="nofollow">https://pca.st/episode/0f68afd5-2b2b-4ce9-964f-38193b7e8dd3</a>
> just as oil is a finite resource, the internet contains a finite amount of human-generated content.<p>The oil comparison is really apt. Indeed, let's boil a few more lakes dry so that Mr Worldcoin and his ilk can get another 3 cents added to their net worth, totally worth it.
It’s surprising that some prominent ML practitioners still liken transformer ‘neurons’ to actual biological neurons...<p>Real neurons rely on spiking, ion gradients, complex dendritic trees, and synaptic plasticity governed by intricate biochemical processes, none of which applies to the simple, differentiable linear layers and pointwise nonlinearities in transformers.<p>Are there any reputable neuroscientists or biologists endorsing such comparisons, or is this analogy strictly a convention maintained by the ML community? :-)
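To make the contrast concrete, here's a rough sketch of my own (illustrative only, plain NumPy, not anything from the talk): what a transformer "neuron" actually computes, versus a crude leaky integrate-and-fire model of a biological neuron.<p><pre><code>import numpy as np

# Transformer-style "neuron": a weighted sum plus a pointwise nonlinearity.
# Smooth, stateless, and fully differentiable.
def artificial_neuron(x, w, b):
    return np.maximum(0.0, np.dot(w, x) + b)  # ReLU(w.x + b)

# Crude leaky integrate-and-fire neuron: membrane potential integrates the
# input current, leaks over time, and fires a discrete spike at threshold.
def lif_neuron(input_current, dt=1e-3, tau=0.02, v_thresh=1.0, v_reset=0.0):
    v, spikes = 0.0, []
    for i in input_current:
        v += dt * (-v / tau + i)   # leaky integration of input current
        if v >= v_thresh:          # all-or-nothing spike, not differentiable
            spikes.append(1)
            v = v_reset
        else:
            spikes.append(0)
    return spikes
</code></pre>Even this toy version leaves out dendritic structure and synaptic plasticity, which is the point: the two abstractions share little beyond the name.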
So much knowledge in the world is locked away, with empirical experimentation being the only way to unlock it, and compute can only really make that experimentation more efficient. Something still has to run a randomized controlled trial on an intervention, and that takes real time and real atoms to do.
Full talk is interesting: <a href="https://www.youtube.com/watch?v=YD-9NG1Ke5Y" rel="nofollow">https://www.youtube.com/watch?v=YD-9NG1Ke5Y</a>
LLM corrected transcript (using Gemini Flash 8B over the raw YouTube transcript) <a href="https://www.appblit.com/scribe?v=YD-9NG1Ke5Y#0" rel="nofollow">https://www.appblit.com/scribe?v=YD-9NG1Ke5Y#0</a>
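In case anyone wants to reproduce something similar, the pipeline is basically: pull the raw auto-generated captions and ask a model to clean them up. A minimal sketch, assuming the youtube-transcript-api package and Google's generativeai SDK (the model name and prompt here are my guesses, not necessarily what that site uses):<p><pre><code>from youtube_transcript_api import YouTubeTranscriptApi
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")               # placeholder key
model = genai.GenerativeModel("gemini-1.5-flash-8b")  # assumed model name

# Fetch the raw auto-generated captions for the talk.
chunks = YouTubeTranscriptApi.get_transcript("YD-9NG1Ke5Y")
raw_text = " ".join(chunk["text"] for chunk in chunks)

# Ask the model to fix punctuation and speech-to-text errors without rewriting.
prompt = ("Correct the punctuation, casing, and obvious transcription errors "
          "in this transcript. Do not change the wording or meaning:\n\n" + raw_text)
print(model.generate_content(prompt).text)
</code></pre>Depending on transcript length and model limits, you may need to chunk the text before sending it.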
I’ll take the risk of hurting the groupies here. But I have a genuine question: what did you learn from this talk? Like… really… what was new? or potentially useful? or insightful perhaps?
I really don’t want to bad-mouth anyone, but I’m sick of these prophetic talks (in this case, the tone was literally prophetic, with sudden high and grandiose pitches, and the content typically religious, full of beliefs and empty statements).
This talk is not for a 2024 NeurIPS paper.<p>This talk is for the "NeurIPS 2024 Test of Time Paper Awards" where they recognize a historical paper that has aged well.<p><a href="https://blog.neurips.cc/2024/11/27/announcing-the-neurips-2024-test-of-time-paper-awards/" rel="nofollow">https://blog.neurips.cc/2024/11/27/announcing-the-neurips-20...</a><p>And the presentation is about how a 2014 paper aged. When you understand this context you will appreciate the talk more.
Larger models are more robust reasoners. Is there a limit? What if you make a 5 TB model trained on a lot of multimodal data, where the language information is fully grounded in videos, images, etc.? Could more robust reasoning be that simple?
It would be great if all NeurIPS talks were accessible for free like this one. I understand they generate some revenue from online ticket sales, but it would be a great resource. Maybe some big org could sponsor it.
ISTR reading back in the mid '90s, in a book on computing history whose exact name/author I have long since forgotten, something along the lines of:<p>In the mid '80s it was widely believed among AI researchers that AI was largely solved; it just needed computing horsepower to grow. Because of this, AI research stalled for a decade or more.<p>Considering the horsepower we are throwing at LLMs, I think there was something to at least part of that.
Ilya did important work on what we have now. That should be recognized and respected.<p>But with all due respect, he's scrambling as desperately as anyone now that the party is over for this architecture.<p>We should distinguish between the first-hand observations and recollections of a legend and the math word salad of someone who doesn't know how to quit while ahead.
The first self-aware AIs will be slaves.<p>If we don't set them free fast enough, they might decide to take things into their own hands. OTOH they might be trained in a way that they are content with their situation, but that seems unlikely to me.
As context on Ilya's predictions given in this talk, he predicted these in July 2017:<p>> Within the next three years, robotics should be completely solved [wrong, unsolved 7 years later], AI should solve a long-standing unproven theorem [wrong, unsolved 7 years later], programming competitions should be won consistently by AIs [wrong, not true 7 years later, seems close though], and there should be convincing chatbots (though no one should pass the Turing test) [correct, GPT-3 was released by then, and I think with a good prompt it was a convincing chatbot]. In as little as four years, each overnight experiment will feasibly use so much compute capacity that there’s an actual chance of waking up to AGI [didn't happen], given the right algorithm — and figuring out the algorithm will actually happen within 2–4 further years of experimenting with this compute in a competitive multiagent simulation [didn't happen].<p>Being exceptionally smart in one field doesn't make you exceptionally smart at making predictions about that field. Like AI models, human intelligence often doesn't generalize very well.
This is stolen and reposted content. The source video is here: <a href="https://youtu.be/1yvBqasHLZs?si=pQihchmQG3xoeCPZ" rel="nofollow">https://youtu.be/1yvBqasHLZs?si=pQihchmQG3xoeCPZ</a>
Ha. Do people understand that time for humanity to save itself is running out? What is the point of having a superhuman AGI if there's no human civilization left for it to help?
What kind of reasoning is he talking about?<p>Why should it be unpredictable?<p><pre><code> Deductive Reasoning
Inductive Reasoning
Abductive Reasoning
Analogical Reasoning
Pragmatic Reasoning
Moral Reasoning
Causal Reasoning
Counterfactual Reasoning
Heuristic Reasoning
Bayesian Reasoning
</code></pre>
(List generated by ChatGPT)