
An illustrated guide to automatic sparse differentiation

137 points | by mariuz | about 1 month ago

8 comments

gwf, about 1 month ago

Not trying to "Schmidhuber" this or anything, but I think my 1999 NIPS paper gives a cleaner derivation and explanation for working on the Jacobian. In it, I derive a Jacobian operator that allows you to compute arbitrary products between the Jacobian and any vector, with complexity comparable to standard backprop.

[*] G.W. Flake & B.A. Pearlmutter, "Differentiating Functions of the Jacobian with Respect to the Weights," https://proceedings.neurips.cc/paper_files/paper/1999/file/b9f94c77652c9a76fc8a442748cd54bd-Paper.pdf
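For readers unfamiliar with the matrix-free framing, here is a minimal JAX sketch of the generic idea: Jacobian-vector and vector-Jacobian products at roughly the cost of one extra forward or backward pass. This is not the construction from the 1999 paper, and the toy function `f` is made up for illustration.

```python
# Matrix-free Jacobian products in JAX: J @ v via forward mode, u^T @ J via
# reverse mode, without ever materializing J. (Generic sketch, not the
# specific Jacobian operator derived in the 1999 paper.)
import jax
import jax.numpy as jnp

def f(x):
    # Toy nonlinear map R^3 -> R^3 with a sparse Jacobian.
    return jnp.array([x[0] ** 2, x[1] * x[2], jnp.sin(x[2])])

x = jnp.array([1.0, 2.0, 3.0])
v = jnp.array([1.0, 0.0, 0.0])   # direction for J @ v
u = jnp.array([0.0, 1.0, 0.0])   # direction for u^T @ J

# Forward mode: J(x) @ v.
y, Jv = jax.jvp(f, (x,), (v,))

# Reverse mode: u^T @ J(x).
y, vjp_fn = jax.vjp(f, x)
(uTJ,) = vjp_fn(u)

print(Jv, uTJ)
```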
rdyro, about 1 month ago

A really cool post and a great set of visualizations!

Computing sparse Jacobians can save a lot of compute if there's a real lack of dependency between parts of the input and the output. Discovering this automatically through coloring is very appealing.

Another alternative is to implement sparse rules for each operation yourself, but that often requires custom autodiff implementations, which aren't easy to get right. I wrote a small toy version of a sparse rules-based autodiff here: https://github.com/rdyro/SpAutoDiff.jl

Another example (a much more serious one) is https://github.com/microsoft/folx
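To make the coloring idea concrete, here is a hand-rolled JAX sketch of column compression, assuming the sparsity pattern is already known: structurally orthogonal columns share a color, and one JVP per color recovers the whole Jacobian. The function `f`, the pattern, and the greedy coloring are made up for illustration; the libraries discussed in the post detect and color patterns far more carefully than this.

```python
# Column-compressed sparse Jacobian: one JVP per color instead of one per
# input dimension. (Illustrative sketch with a hard-coded sparsity pattern.)
import jax
import jax.numpy as jnp
import numpy as np

def f(x):
    # Cyclic banded dependency: output i touches inputs i-1, i, i+1 (mod n).
    return x * jnp.roll(x, 1) + jnp.roll(x, -1)

n = 9
x = jnp.arange(1.0, n + 1.0)

# Known sparsity pattern of J: row i has nonzeros in columns (i-1, i, i+1) mod n.
pattern = np.zeros((n, n), dtype=bool)
for i in range(n):
    pattern[i, [(i - 1) % n, i, (i + 1) % n]] = True

# Greedy column coloring: two columns share a color only if they never share
# a nonzero row (i.e. they are structurally orthogonal).
colors = -np.ones(n, dtype=int)
for j in range(n):
    banned = {colors[k] for k in range(j) if np.any(pattern[:, j] & pattern[:, k])}
    colors[j] = next(c for c in range(n) if c not in banned)

# One JVP per color, seeded with the sum of that color's columns.
J = np.zeros((n, n))
for c in range(colors.max() + 1):
    seed = jnp.asarray((colors == c).astype(float))
    _, Jseed = jax.jvp(f, (x,), (seed,))
    for j in np.where(colors == c)[0]:
        J[pattern[:, j], j] = np.asarray(Jseed)[pattern[:, j]]

# 3 JVPs instead of 9, and the result matches the dense Jacobian.
assert np.allclose(J, np.asarray(jax.jacfwd(f)(x)))
```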
whitten, about 1 month ago

This paper is written by three Europeans who clearly understand these mathematical ideas.

Is this type of analysis part of a particular mathematical heritage? What would it be called?

Is this article relevant? https://medium.com/@lobosi/calculus-for-machine-learning-jacobians-and-hessians-816ef9d55a39
FilosofumRex, about 1 month ago

The classic reference on the subject is "Numerical Linear Algebra" by Lloyd Trefethen. Skip to the last chapter, on iterative methods, for the computational aspects. You'll learn a lot more, and faster, with Matlab.

https://davidtabora.wordpress.com/wp-content/uploads/2015/01/lloyd_n-_trefethen_david_bau_iii_numerical_line.pdf

A short overview is chapter 11 of Gilbert Strang's Introduction to Linear Algebra: https://math.mit.edu/~gs/linearalgebra/ila5/linearalgebra5_11-1.pdf

AD comes from a different tradition, dating back to FORTRAN 77 programmers' attempts to differentiate non-elementary functions (for loops, procedural functions, subroutines, etc.). Note the hardware specs for some nostalgia: https://www.mcs.anl.gov/research/projects/adifor/
nathan_douglas, about 1 month ago

Picking my way through this slowly... I'm familiar with autodiff but some of these ideas are very new to me. This seems really, really exciting though.
patrick451, about 1 month ago

The optimal control framework Casadi has had the ability to compute sparse Jacobians and Hessians for a long time (maybe a decade?); these come up all the time in trajectory optimization. This not only provides massive speed-ups in both differentiation and linear-solver time, but also greatly reduces the memory requirements. If this catches on in machine learning, it will be interesting to see whether we can finally move past first-order optimization methods.
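A toy JAX sketch (not Casadi) of why these Jacobians are sparse in trajectory optimization: each defect constraint involves only x_k, u_k, and x_{k+1}, so most of the constraint Jacobian is structurally zero. The double-integrator dynamics, horizon, and dimensions below are made up for illustration.

```python
# Sparsity of the defect-constraint Jacobian in a simple shooting-style
# transcription: defect k depends only on x_k, x_{k+1}, and u_k.
import jax
import jax.numpy as jnp

N, nx, nu, dt = 5, 2, 1, 0.1

def dynamics(x, u):
    # Double integrator: velocity integrates a scalar force.
    return jnp.array([x[1], u[0]])

def defects(z):
    # z stacks the states x_0..x_N followed by the controls u_0..u_{N-1}.
    xs = z[: (N + 1) * nx].reshape(N + 1, nx)
    us = z[(N + 1) * nx :].reshape(N, nu)
    return jnp.concatenate(
        [xs[k + 1] - (xs[k] + dt * dynamics(xs[k], us[k])) for k in range(N)]
    )

z = jnp.ones((N + 1) * nx + N * nu)
J = jax.jacfwd(defects)(z)
print((jnp.abs(J) > 0).astype(int))  # each row block touches only x_k, x_{k+1}, u_k
```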
goosedragons, about 1 month ago

There is automatic sparse differentiation available in the R ecosystem. That's what the RTMB & TMB packages do.
oulipo, about 1 month ago

Sparsely-related question: is the blog style/css open-source?