
Ask HN: What were the papers on the list Ilya Sutskever gave John Carmack?

396 points by alan-stark over 2 years ago

John Carmack's new interview on AI/AGI [1] carries a puzzle:

"So I asked Ilya Sutskever, OpenAI's chief scientist, for a reading list. He gave me a list of like 40 research papers and said, 'If you really learn all of these, you'll know 90% of what matters today.' And I did. I plowed through all those things and it all started sorting out in my head."

What papers do you think were on this list?

[1] https://dallasinnovates.com/exclusive-qa-john-carmacks-different-path-to-artificial-general-intelligence/

26 comments

dang over 2 years ago

Recent and related:

"John Carmack's 'Different Path' to Artificial General Intelligence" - https://news.ycombinator.com/item?id=34637650 - Feb 2023 (402 comments)
sillysaurusx over 2 years ago

"The email including them got lost to Meta's two-year auto-delete policy by the time I went back to look for it last year. I have a binder with a lot of them printed out, but not all of them."

RIP. If it's any consolation, it sounds like the list is at least three years old by now. Which is a long time, considering that 2016 is generally regarded as the date of the deep learning revolution.
querez over 2 years ago

A lot of other posts here are biased toward recent papers and papers that had "a big impact", but miss a lot of foundations. I think this reddit post on the most foundational ML papers gives a much more balanced overview: https://www.reddit.com/r/MachineLearning/comments/zetvmd/d_if_you_had_to_pick_1020_significant_papers_that/
sho_hn over 2 years ago

> "You'll find people who can wax rhapsodic about the singularity and how everything is going to change with AGI. But if I just look at it and say, if 10 years from now, we have 'universal remote employees' that are artificial general intelligences, run on clouds, and people can just dial up and say, 'I want five Franks today and 10 Amys, and we're going to deploy them on these jobs,' and you could just spin up like you can cloud-access computing resources, if you could cloud-access essentially artificial human resources for things like that—that's the most prosaic, mundane, most banal use of something like this."

So, slavery?
optimalsolver over 2 years ago

Carmack says he's pursuing a different path to AGI, then goes straight to the guy at the center of the most saturated area of machine learning (deep learning)?

I would've hoped he'd be exploring weirder alternatives off the beaten path. I mean, neural networks might not even be necessary for AGI, but no one at OpenAI is going to tell Carmack that.
chrgy over 2 years ago

From ChatGPT. Personally I think this list is a bit dated, but it should cover the 60% mark at the very least.

Deep Learning: AlexNet (2012), VGGNet (2014), ResNet (2015), GoogleNet (2015), Transformer (2017)

Reinforcement Learning: Q-Learning (Watkins & Dayan, 1992), SARSA (R. S. Sutton & Barto, 1998), DQN (Mnih et al., 2013), A3C (Mnih et al., 2016), PPO (Schulman et al., 2017)

Natural Language Processing: Word2Vec (Mikolov et al., 2013), GLUE (Wang et al., 2018), ELMo (Peters et al., 2018), GPT (Radford et al., 2018), BERT (Devlin et al., 2019)
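As a concrete anchor for one of the reinforcement-learning foundations named above (Q-Learning, Watkins & Dayan, 1992), here is a minimal tabular sketch. The three-state chain environment and all hyperparameters are made-up toy placeholders, not anything from the thread or the actual reading list.

```python
# Minimal tabular Q-learning sketch (Watkins & Dayan, 1992).
# The 3-state chain MDP below is a made-up toy example.
import random

N_STATES, N_ACTIONS = 3, 2          # toy chain: action 0 = left, 1 = right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]

def step(state, action):
    """Toy dynamics: reaching the rightmost state pays +1 and ends the episode."""
    next_state = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

state = 0
for _ in range(10_000):
    # epsilon-greedy action selection
    if random.random() < EPSILON:
        action = random.randrange(N_ACTIONS)
    else:
        action = max(range(N_ACTIONS), key=lambda a: Q[state][a])

    next_state, reward, done = step(state, action)

    # Q-learning update: bootstrap from the greedy value of the next state
    td_target = reward + (0.0 if done else GAMMA * max(Q[next_state]))
    Q[state][action] += ALPHA * (td_target - Q[state][action])

    state = 0 if done else next_state

print(Q)  # learned values should favor action 1 (move right) in every state
```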
ilaksh over 2 years ago

My guess is that multimodal transformers will probably eventually get us most of the way there for general-purpose AI.

But AGI is one of those very ambiguous terms. For many people it's either an exact digital replica of human behavior that is alive, or something like a God. I think it should also apply to general-purpose AI that can do most human tasks in a strictly guided way, without having the other characteristics of humans or animals. For that, I think it can be built on advanced multimodal transformer-based architectures.

For the other stuff, it's worth giving a passing glance to the fairly extensive body of research that has been labeled AGI over the last decade or so. It hasn't really been mainstream, except maybe in the last couple of years, because really forward-looking people tend to be marginalized, including in academia.

https://agi-conf.org

Looking forward, my expectation is that things like memristors or other compute-in-memory hardware will become very popular within, say, 2-5 years (obviously total speculation, since there are no products yet that I know of), and that they will be vastly more efficient and powerful, especially for AI. And there will be algorithms for general-purpose AI, possibly inspired by transformers or AGI research but tailored to these new compute-in-memory systems.
jimmySixDOF over 2 years ago

> 90% of what matters today

Strikes me as the kind of thing where that last 10% will need 400 papers.
codeviking over 2 years ago

This inspired us to do a little exploration. We used the top cited papers of a few authors to produce a list that might be interesting, and to do some additional analysis. Take a look: https://github.com/allenai/author-explorer
hexhowells over 2 years ago

While not all of them are papers, this list contains a lot of important papers, writings, and conversations in AI right now: https://docs.google.com/document/d/1bEQM1W-1fzSVWNbS4ne5PopB2b7j8zD4Jc3nm4rbK-U/edit
albertzeyer over 2 years ago

(Partly copied from https://news.ycombinator.com/item?id=34640251.)

On models: Obviously, almost everything is a Transformer nowadays ("Attention Is All You Need"). However, to get into the field and get a good overview, you should also look a bit beyond the Transformer. RNNs/LSTMs are still a must-learn, even though Transformers might be better at many tasks. And the memory-augmented models, e.g. the Neural Turing Machine and its follow-ups, are important too.

It also helps to know different architectures: plain language models (GPT), attention-based encoder-decoders (e.g. the original Transformer), but also CTC, hybrid HMM-NN, and transducers (RNN-T).

Some self-promotion: I think my PhD thesis does a good job of giving an overview of this: https://www-i6.informatik.rwth-aachen.de/publications/download/1223/Zeyer--2022.pdf

Diffusion models are another recent, different kind of model.

Then, a separate topic is the training aspect. Most papers do supervised training, using a cross-entropy loss against the ground-truth target. However, there are many other setups:

There is CLIP, which combines the text and image modalities.

There is the whole field of unsupervised or self-supervised training methods. Language model training (next-token prediction) is one example, but there are others.

And then there is the big field of reinforcement learning, which is probably also quite relevant for AGI.
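To make the "supervised cross-entropy" and next-token-prediction training setup mentioned above concrete, here is a minimal sketch using PyTorch. The tiny LSTM language model, vocabulary size, and random batch are made-up placeholders for illustration, not anything from the comment or the reading list.

```python
# Minimal next-token-prediction training step with a cross-entropy loss.
import torch
import torch.nn as nn

VOCAB, EMB, HIDDEN = 1000, 64, 128

class TinyLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, EMB)
        self.rnn = nn.LSTM(EMB, HIDDEN, batch_first=True)
        self.head = nn.Linear(HIDDEN, VOCAB)

    def forward(self, tokens):                  # tokens: (batch, seq)
        h, _ = self.rnn(self.embed(tokens))
        return self.head(h)                     # logits: (batch, seq, vocab)

model = TinyLM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

tokens = torch.randint(0, VOCAB, (8, 32))       # random stand-in batch

# Next-token prediction: the target at position t is the token at t+1,
# and the loss is plain cross-entropy against that ground-truth target.
logits = model(tokens[:, :-1])
targets = tokens[:, 1:]
loss = loss_fn(logits.reshape(-1, VOCAB), targets.reshape(-1))

opt.zero_grad()
loss.backward()
opt.step()
print(loss.item())
```

The same loss structure carries over to decoder-only Transformers; only the model class changes.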
klaussilveira over 2 years ago

Following: https://twitter.com/u3dcommunity/status/1621524851898089478?s=20
polskibus over 2 years ago

What about just asking Carmack on Twitter?
KRAKRISMOTT over 2 years ago
Start tweeting at him until he shares
EvgeniyZh over 2 years ago

Attention, scaling laws, diffusion, vision transformers, BERT/RoBERTa, CLIP, Chinchilla, ChatGPT-related papers, NeRF, Flamingo, RETRO/some retrieval SOTA.
username3 over 2 years ago

They asked on Twitter and he didn't reply. We need someone with a blue check mark to ask. https://twitter.com/ifree0/status/1620855608839897094
winwhiz over 2 years ago

I had read that somewhere else, and this is as far as I got:

https://twitter.com/id_aa_carmack/status/1241219019681792010
throwaway4837 over 2 years ago

Wow, crazy coincidence that you all read this article yesterday too. I was thinking of emailing one of them for the list, then I fell asleep. Cold emails to scientists generally have a higher success rate than average, in my experience.
cloudking over 2 years ago

Ilya's publications may be on the list: https://scholar.google.com/citations?user=x04W_mMAAAAJ&hl=en
daviziko over 2 years ago

I wonder what Ilya Sutskever would recommend as an updated list nowadays. I don't have a Twitter account, otherwise I'd ask him myself :)
Phil_Latio over 2 years ago

Not in the list: https://arxiv.org/pdf/1805.09001.pdf
evc123 over 2 years ago

https://arxiv.org/abs/2210.14891
adt over 2 years ago

https://lifearchitect.ai/papers/
vikashrungta over 2 years ago

I posted a list of papers on Twitter, and will be posting a summary for each of them as well. Here is the list: https://twitter.com/vrungta/status/1623343807227105280

Unlocking the Secrets of AI: A Journey through the Foundational Papers by @vrungta (2023)

1. "Attention is All You Need" (2017) - https://arxiv.org/abs/1706.03762 (Google Brain)
2. "Generative Adversarial Networks" (2014) - https://arxiv.org/abs/1406.2661 (University of Montreal)
3. "Dynamic Routing Between Capsules" (2017) - https://arxiv.org/abs/1710.09829 (Google Brain)
4. "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks" (2016) - https://arxiv.org/abs/1511.06434 (University of Montreal)
5. "ImageNet Classification with Deep Convolutional Neural Networks" (2012) - https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf (University of Toronto)
6. "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding" (2018) - https://arxiv.org/abs/1810.04805 (Google)
7. "RoBERTa: A Robustly Optimized BERT Pretraining Approach" (2019) - https://arxiv.org/abs/1907.11692 (Facebook AI)
8. "ELMo: Deep contextualized word representations" (2018) - https://arxiv.org/abs/1802.05365 (Allen Institute for Artificial Intelligence)
9. "Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context" (2019) - https://arxiv.org/abs/1901.02860 (Google AI Language)
10. "XLNet: Generalized Autoregressive Pretraining for Language Understanding" (2019) - https://arxiv.org/abs/1906.08237 (Google AI Language)
11. "T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer" (2020) - https://arxiv.org/abs/1910.10683 (Google Research)
12. "Language Models are Few-Shot Learners" (2020) - https://arxiv.org/abs/2005.14165 (OpenAI)
theusus over 2 years ago

As if papers were that comprehensible.
mgaunard over 2 years ago

In my experience, deep learning is overhyped, and most needs that aren't already addressed by linear regression can be handled with simple supervised learning.
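For reference, this is the kind of linear-regression baseline the comment alludes to: a closed-form least-squares fit in NumPy. The synthetic data below is a made-up placeholder, not anything from the thread.

```python
# Minimal least-squares linear-regression baseline with NumPy.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                    # 200 samples, 3 features
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=200)      # noisy linear target

# Add a bias column and solve the least-squares problem directly.
X_b = np.hstack([X, np.ones((200, 1))])
w, *_ = np.linalg.lstsq(X_b, y, rcond=None)

print("recovered weights:", w[:3], "bias:", w[3])
```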