The Elements of Differentiable Programming

131 points by leephillips about 1 year ago

8 comments

MikeBattaglia about 1 year ago
One very interesting thing about automatic differentiation is that you can think of it as involving a new algebra, similar to the complex numbers, where we adjoin an extra element to the reals to form a plane. This new algebra is called the ring of "dual numbers." The difference is that instead of adding a new element "i" with i² = -1, we add one called "h" with h² = 0!

Every element in the dual numbers is of the form a + bh, and in fact the entire ring can be turned into a totally ordered ring in a very natural way: simply declare 0 < h < r for any real r > 0. In essence, we are saying h is an infinitesimal, so small that its square is 0. So we have a non-Archimedean ring with infinitesimals, the smallest such ring extending the real numbers.

Why is this so important? Well, if you have some function f which can be extended to the dual number plane (which many can, similar to the complex plane) we have

f(x+h) = f(x) + f'(x)h

Which is little more than restating the usual definition of the derivative: f'(x) = (f(x+h) - f(x))/h

For instance, suppose we have f(x) = 2x² - 3x + 1, then

f(x+h) = 2(x+h)² - 3(x+h) + 1 = 2(x² + 2xh + h²) - 3(x+h) + 1 = (2x² - 3x + 1) + (4x - 3)h

Where the last step just involves rearranging terms and canceling out the h² = 0 term. Note that the expression for the derivative we get, (4x - 3), is correct, and magically computed itself straight from the properties of the algebra.

In short, just like creating i² = -1 revolutionized algebra, setting h² = 0 revolutionizes calculus. Most autodiff packages (such as PyTorch) use something not much more advanced than this, although there are optimizations to speed it up (e.g. reverse mode diff).
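The dual-number arithmetic described in this comment can be sketched in a few lines of Python. This is only an illustration: the `Dual` class, its field names, and the `derivative` helper are hypothetical names made up for this sketch, not the API of PyTorch or any other autodiff package, and only addition, subtraction, and multiplication are supported.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A dual number real + eps*h with h*h == 0 (hypothetical helper class)."""
    real: float        # the ordinary value a
    eps: float = 0.0   # the coefficient b of h

    @staticmethod
    def _lift(other):
        # Treat plain numbers as dual numbers with a zero h-coefficient.
        return other if isinstance(other, Dual) else Dual(float(other))

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.real + other.real, self.eps + other.eps)

    __radd__ = __add__

    def __sub__(self, other):
        other = Dual._lift(other)
        return Dual(self.real - other.real, self.eps - other.eps)

    def __rsub__(self, other):
        return Dual._lift(other) - self

    def __mul__(self, other):
        other = Dual._lift(other)
        # (a + bh)(c + dh) = ac + (ad + bc)h, because the h*h term vanishes.
        return Dual(self.real * other.real,
                    self.real * other.eps + self.eps * other.real)

    __rmul__ = __mul__


def derivative(f, x):
    """Evaluate f at x + h and read off the coefficient of h."""
    return f(Dual(x, 1.0)).eps


def f(x):
    # The polynomial from the comment: 2x^2 - 3x + 1.
    return 2 * x * x - 3 * x + 1
```

Calling `derivative(f, 2.0)` evaluates the polynomial at 2 + h; the h-coefficient of the result is the value of 4x - 3 at x = 2. The same `f` still works on plain floats, which is the appeal of the approach: the function body does not need to know it is being differentiated.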
macawfish about 1 year ago
This is amazing! Seems like a perfect excuse to get back into Julia. I just wish Julia had more compile targets. Ideally I'd like to have the option to target the browser (wasm/webgpu).
amelius about 1 year ago
Would this be useful for general applications, or just numerical ones?
ubj about 1 year ago
This looks like a great resource! Differentiable programming is such a cool area. I'll never forget writing a differentiable PID controller a few years ago and watching the PID gains get tuned automagically to stabilize the control system. It's powerful stuff if you use it in the right places.
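As a rough illustration of the idea in this comment (not the commenter's actual controller, which is not shown in the thread), here is a toy Python sketch. It assumes a proportional-only controller, a made-up first-order plant x' = -x + u, and a squared-error loss; the names `simulate` and `tune` and all parameter values are hypothetical. The derivative with respect to the gain `kp` is carried by hand alongside each quantity, which is forward-mode differentiation written out explicitly.

```python
def simulate(kp, steps=50, dt=0.1, target=1.0):
    """Run a P controller on the toy plant x' = -x + u.

    Returns (loss, dloss_dkp). Each quantity is paired with its derivative
    with respect to kp and both are updated together.
    """
    x, dx = 0.0, 0.0          # plant state and d(state)/d(kp)
    loss, dloss = 0.0, 0.0    # accumulated squared error and its derivative
    for _ in range(steps):
        err, derr = target - x, -dx
        u, du = kp * err, err + kp * derr  # product rule on kp * err
        x, dx = x + dt * (u - x), dx + dt * (du - dx)  # explicit Euler step
        loss, dloss = loss + dt * err * err, dloss + dt * 2.0 * err * derr
    return loss, dloss


def tune(kp, lr=0.05, iters=20):
    """Plain gradient descent on the gain kp (assumed step size and count)."""
    for _ in range(iters):
        _, grad = simulate(kp)
        kp -= lr * grad
    return kp
```

Starting from `kp = 0.5`, `tune(0.5)` pushes the gain upward because a larger proportional gain shrinks the tracking error of this toy plant. A real PID setup would add integral and derivative terms and would normally use an autodiff library instead of hand-written derivatives.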
dkjaudyeqooe about 1 year ago
This is a timely book. Maybe more interesting (at least to me) than the recent results of AI research is the application of techniques from the field elsewhere.

For my work going forward, catering to automatic differentiation in the code is a no-brainer.
geor9e about 1 year ago
Why did the deep learning model cross the road? Because it was smooth and differentiable
whoevercares about 1 year ago
TBH I would hope this is a JAX deep dive book
fpgamlirfanboy about 1 year ago
i don't know why people write these things. it's an absolute hodge-podge of theorem/proofs/results/techniques with no unifying theme other than "CALCULUS". so it's a pretty bad math book to actually learn math from (you can always spot a pedagogically unsound math book by its lack of exercises). the book doesn't even have any code in it which is surprising considering it has "programming" in the title.

actually i know why people write these but i still don't know why they publish them: this is a phase everyone goes through in their "math" life where they look back on everything they've learned hastily between undergrad/phd/postdoc (or whatever) and they have the urge to formalize/crystallize. everyone has the urge - i had a late-career QFT prof tell me that he was excited to take his sabbatical so that he could finally do all of the exercises in peskin&schroeder for real real and type it all up neatly.

i've done it too, in-the-small (some very nice notes that i'm proud of, on various things). you sit down, make your list of things, pull up all of the books/papers you're going to use as references and you start essentially transcribing - but you tell yourself you're putting your own spin on it (adding ample "motivation"). and it's all fine and healthy and gratifying for you and yourself alone. but i don't think i'd ever imagine to myself "well hmm my organization of these topics is going to be useful for other people i should put it out there for the world to see". but that's just me.

ninja edit:

before someone jumps down my throat about "what's the harm?".

the harm is n00b/undergrads/young people/etc will look at this and think this is the right way to learn this material and some of them will even make an attempt to learn the material from this thing and they'll struggle and fail and be discouraged - i speak from experience! it's not a good thing for the community. sure maybe 1 in a 100 can learn this stuff from just reading a monograph (what these things used to be called...) but that's the exception that proves the rule.