Gradients are not all you need

163 points by bundie about 2 years ago

6 comments

modeless about 2 years ago
Seems to me like the whole history of neural nets is basically crafting models with well-behaved gradients to make gradient descent work well. That, and models that can achieve high utilization of available hardware. The surprising thing is that models exist where the gradients are *so* well-behaved that we can learn GPT-4 level stuff.
0xBABAD00C about 2 years ago
&gt; Gradients Are Not All You Need

Sometimes you need to peek at the Hessian.

Seriously though, what is intelligence if not creative unrolling of the first few terms of the Taylor expansion?
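For reference, the joke above points at the second-order Taylor expansion, where the Hessian supplies the curvature term that first-order gradient methods ignore:

```latex
f(x + \delta) \approx f(x) + \nabla f(x)^\top \delta + \tfrac{1}{2}\,\delta^\top H(x)\,\delta,
\qquad H_{ij}(x) = \frac{\partial^2 f}{\partial x_i \,\partial x_j}.
```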
Der_Einzige about 2 years ago
Global optimization techniques which don't rely on gradients seem theoretically superior in all instances, except that we haven't found super fast ways to run these kinds of optimizers.

The cartpole demo famously tripped up derivative-based reinforcement learning for a while.
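As a concrete (if toy) illustration of the gradient-free optimizers mentioned above, here is a minimal (1+1) evolution strategy in Python. The Rastrigin test function and all names here are my own illustrative choices, not anything from the thread or the paper:

```python
import math
import random

def rastrigin(x):
    # Non-convex test function with many local minima; plain gradient
    # descent from a random start is easily trapped in one of them.
    return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x)

def one_plus_one_es(f, x, sigma=0.5, iters=5000, seed=0):
    # (1+1) evolution strategy: propose a Gaussian perturbation of the
    # current point and keep it only if the objective improves.
    # Only function evaluations are used -- no gradients anywhere.
    rng = random.Random(seed)
    best = f(x)
    for _ in range(iters):
        cand = [xi + rng.gauss(0, sigma) for xi in x]
        fc = f(cand)
        if fc <= best:
            x, best = cand, fc
    return x, best

x, fx = one_plus_one_es(rastrigin, [3.0, -2.0])
```

The trade-off the comment alludes to shows up directly: each step costs a full objective evaluation, and nothing amortizes across dimensions the way a backward pass does.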
msackmann about 2 years ago
Interesting paper, thanks for bringing this up! I have been working on methods for trajectory optimization using both analytic gradient computations and black-box stochastic gradient approximations (proximal policy optimization).

I have always wondered about a question touched on in the paper: despite the analytic gradient computation being intuitively more efficient and mathematically correct, it is much harder to learn a policy with it than with the "brute force trial-and-error" black-box methods.

This paper brings many new perspectives on why.
asdfman123 about 2 years ago
&gt; chaos based failure mode

I studied this in undergrad, but it's not the same thing the paper is talking about.
unlikelymordant about 2 years ago
My one wish is that machine learning papers would use titles that actually described what the paper was about. I suppose there is a certain 'evolutionary pressure' where clever titles 'outcompete' drier, more descriptive titles (or it seems that way). But I don't like it.