
Differentiable Programming – A Simple Introduction

159 points by dylanbfox, about 3 years ago

7 comments

infogulch, about 3 years ago
The most interesting thing I've seen on AD is "The simple essence of automatic differentiation" (2018) [1]. See past discussion [2], and talk [3]. I think the main idea is that by compiling to categories and pairing up a function with its derivative, the pair becomes trivially composable in forward mode, and the whole structure is easily converted to reverse mode afterwards.

[1]: https://dl.acm.org/doi/10.1145/3236765

[2]: https://news.ycombinator.com/item?id=18306860

[3]: Talk at Microsoft Research: https://www.youtube.com/watch?v=ne99laPUxN4 Other presentations listed here: https://github.com/conal/essence-of-ad
[Comment #31018084 not loaded]
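
A minimal sketch of that pairing idea in Python (an illustration, not the paper's categorical construction): carry each value together with its derivative, so that composing functions composes derivatives by the chain rule in a single forward pass.

    # Forward-mode AD by pairing each value with its derivative ("dual numbers").
    # Composing functions then composes derivatives via the chain rule.
    import math

    class Dual:
        def __init__(self, value, deriv):
            self.value = value   # f(x)
            self.deriv = deriv   # f'(x)

        def __add__(self, other):
            return Dual(self.value + other.value, self.deriv + other.deriv)

        def __mul__(self, other):
            # Product rule: (fg)' = f'g + fg'
            return Dual(self.value * other.value,
                        self.deriv * other.value + self.value * other.deriv)

    def sin(d):
        # Chain rule: (sin f)' = cos(f) * f'
        return Dual(math.sin(d.value), math.cos(d.value) * d.deriv)

    # d/dx [x * sin(x) + x] at x = 2.0, seeded with dx/dx = 1
    x = Dual(2.0, 1.0)
    y = x * sin(x) + x
    print(y.value, y.deriv)  # value and derivative in one forward pass
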
yauneyz, about 3 years ago
My professor has talked about this. He thinks that the real gem of the deep learning revolution is the ability to take the derivative of arbitrary code and use that to optimize. Deep learning is just one application of that, but there are tons more.
[Comment #31016005 not loaded]
[Comment #31016029 not loaded]
[Comment #31017667 not loaded]
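
As a concrete illustration of "the derivative of arbitrary code" (a sketch using JAX; the function itself is made up): ordinary Python with a loop gets a gradient, which a plain descent loop can then exploit.

    # Differentiate ordinary code (with a Python loop), then optimize with it.
    import jax
    import jax.numpy as jnp

    def loss(theta):
        # Arbitrary numeric code, not a neural network: iterate a map,
        # then penalize distance from a target value.
        x = theta
        for _ in range(3):
            x = jnp.sin(x) + 0.5 * x
        return (x - 1.0) ** 2

    grad_loss = jax.grad(loss)   # derivative of the whole program

    theta = 0.1
    for step in range(100):
        theta = theta - 0.1 * grad_loss(theta)
    print(theta, loss(theta))
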
choeger, about 3 years ago
Nice article, but the intro is a little lengthy.

I have one remark, though: if your language already allows for automatic differentiation, why bother with a neural network in the first place?

I think you should have a good reason for choosing a neural network to approximate the inverse function, and for why it has exactly that number of layers. For instance, why shouldn't a simple polynomial suffice? Could it be that your neural network ends up as an approximation of the Taylor expansion of your inverse function?
[Comment #31025145 not loaded]
[Comment #31018617 not loaded]
[Comment #31022511 not loaded]
[Comment #31023115 not loaded]
[Comment #31023611 not loaded]
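
The polynomial alternative is cheap to test once gradients are free. A sketch under assumed choices (arcsin as the inverse-function target and degree 5 are arbitrary, not from the article):

    # Fit a small polynomial to an inverse function by gradient descent,
    # instead of a neural network. Target and degree are arbitrary here.
    import jax
    import jax.numpy as jnp

    xs = jnp.linspace(-0.9, 0.9, 200)
    ys = jnp.arcsin(xs)              # "inverse function" to approximate

    def poly(coeffs, x):
        # Horner evaluation of a polynomial with the given coefficients.
        acc = jnp.zeros_like(x)
        for c in coeffs:
            acc = acc * x + c
        return acc

    def loss(coeffs):
        return jnp.mean((poly(coeffs, xs) - ys) ** 2)

    coeffs = jnp.zeros(6)            # degree-5 polynomial
    grad = jax.grad(loss)
    for _ in range(2000):
        coeffs = coeffs - 0.2 * grad(coeffs)
    print(loss(coeffs))
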
PartiallyTyped, about 3 years ago
The nice thing about differentiable programming is that we can use all sorts of optimizers beyond gradient descent, some of which offer quadratic convergence instead of linear!
[Comment #31015995 not loaded]
[Comment #31017369 not loaded]
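
Newton's method is the standard example of quadratic convergence, and autodiff supplies the exact Hessian. A sketch (the objective is made up for illustration):

    # Newton's method with an autodiff Hessian: quadratic convergence near
    # the minimum, versus gradient descent's linear rate.
    import jax
    import jax.numpy as jnp

    def f(x):
        # A smooth convex objective; purely illustrative.
        return jnp.sum((x - 1.0) ** 4 + x ** 2)

    grad_f = jax.grad(f)
    hess_f = jax.hessian(f)

    x = jnp.array([3.0, -2.0])
    for _ in range(20):
        # Solve H @ step = grad rather than inverting the Hessian.
        step = jnp.linalg.solve(hess_f(x), grad_f(x))
        x = x - step
    print(x, f(x))
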
fennecs, about 3 years ago
Does someone have an example where the ability to "differentiate" a program gets you something interesting?

I understand perfectly what it means for a neural network, but how about more abstract things?

I'm not even sure that, as currently presented, the implementation actually means something. What is the derivative of a function like List, or Sort, or GroupBy? These articles all assume that somehow it just looks like the derivative from calculus.

Approximating everything as some non-smooth real function doesn't seem entirely morally correct. A program is more discrete, or synthetic. I think it should be a bit more algebraically flavoured, like differentials over a ring.
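
One partial data point for the Sort question (a sketch, not a full answer): sorting numbers is piecewise linear, so a framework like JAX assigns it a gradient almost everywhere, routed back through the sorting permutation; genuinely discrete operations like GroupBy have no comparable story.

    # Sorting numbers is piecewise linear: away from ties it is just a
    # permutation, so a gradient exists almost everywhere and is routed
    # back through that permutation.
    import jax
    import jax.numpy as jnp

    def top_value(x):
        return jnp.sort(x)[-1]          # largest element, via a sort

    x = jnp.array([3.0, 1.0, 4.0, 1.5])
    print(jax.grad(top_value)(x))       # [0. 0. 1. 0.]: gradient follows the permutation
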
fghorow, about 3 years ago
At first glance, this approach appears to reinvent an applied-mathematics approach to optimal control. There, one writes a generalized Hamiltonian, from which forward and backward-in-time paths can be iterated.

The Pontryagin maximum (or minimum, if you define your objective function with a minus sign) principle is the essence of that approach to optimal control.
[Comment #31023213 not loaded]
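
For reference, the textbook first-order conditions that comment alludes to (standard form, not from the article): the state runs forward in time while the costate runs backward, echoing activations and adjoints in reverse-mode AD.

    % Pontryagin's conditions for minimizing \int_0^T L(x,u)\,dt
    % subject to \dot{x} = f(x,u); textbook form.
    H(x, u, \lambda) = L(x, u) + \lambda^\top f(x, u)
    \dot{x} = \frac{\partial H}{\partial \lambda} \quad \text{(forward in time)}
    \dot{\lambda} = -\frac{\partial H}{\partial x} \quad \text{(backward in time)}
    u^*(t) = \arg\min_u \, H(x(t), u, \lambda(t))
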
noobermin, about 3 years ago
The article is okay, but it would have helped to label the axes of the graphs.