BTW, this older article of mine has been extended with a newer one that shows how to handle multiple variables (:

http://blog.demofox.org/2017/02/20/multivariable-dual-numbers-automatic-differentiation/
Check out ForwardDiff.jl [1], which is a really impressive implementation of this exact idea. The technique gets pretty magical results: you can apply it to any function, complete with loops, linear algebra, etc., and it will compute a derivative with something like 10% overhead. The standard approach is finite differencing, which has many-times overhead (one extra function evaluation per input dimension), isn't exact, and can easily blow up for common pathological cases like step functions.

Fede_V's points about the drawbacks of the technique are valid in C++, but Julia's duck typing makes being generic the default (including in the standard library). ForwardDiff works out of the box, for free, in a huge number of cases.

[1] https://github.com/JuliaDiff/ForwardDiff.jl
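To make the "isn't exact" point concrete, here is a small C++ sketch (my own illustration, not from ForwardDiff) of the classic step-size dilemma in finite differencing: a large h suffers truncation error, a tiny h suffers floating-point cancellation, and no single h is safe for all functions:

    #include <cmath>
    #include <cstdio>

    // Central finite difference with step h.
    template <typename F>
    double fd(F f, double x, double h) {
        return (f(x + h) - f(x - h)) / (2 * h);
    }

    int main() {
        auto f = [](double x) { return std::exp(x); };
        double exact = std::exp(1.0);  // d/dx exp(x) at x = 1 is exp(1)

        for (double h : {1e-1, 1e-5, 1e-13}) {
            // Roughly: h = 1e-1 gives ~1e-3 truncation error,
            // h = 1e-5 is near the sweet spot (~1e-10),
            // h = 1e-13 gives ~1e-3 error again from cancellation.
            std::printf("h = %-6g  error = %g\n", h, fd(f, 1.0, h) - exact);
        }
    }

Dual numbers sidestep this tradeoff entirely, because the derivative is propagated exactly through each operation rather than approximated by differencing.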
This is a great post. However, it didn't touch on the two main problems of AD:

- Using dual numbers requires that all functions you call into accept templated parameters (see the sketch below). If you want to use GSL, BLAS, or any other mature math library, you are probably out of luck.

- Even if you are willing to port the code and modify the functions to accept templated parameters, very highly optimized math libraries make assumptions not just about the behaviour of numbers (their API, defined by how they behave under addition, subtraction, etc.) but also about their ABI. For example, a well-tuned LAPACK like OpenBLAS or MKL has carefully chosen loop sizes to optimize cache behaviour, assuming that floats are of a particular size.
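A minimal sketch of the first point (the Dual type and function names here are hypothetical, purely for illustration): a function hard-wired to double cannot be fed a dual-number type, while a templated one can.

    #include <iostream>

    // Toy dual number: value a plus derivative b.
    struct Dual {
        double a, b;
    };
    Dual operator*(Dual x, Dual y) {
        // (a1 + b1*e)(a2 + b2*e) = a1*a2 + (a1*b2 + a2*b1)*e, since e^2 = 0
        return { x.a * y.a, x.a * y.b + x.b * y.a };
    }

    // Hard-wired to double: a Dual cannot pass through it, so no AD.
    double square_fixed(double x) { return x * x; }

    // Templated: instantiates for double, Dual, or anything with operator*.
    template <typename T>
    T square_generic(T x) { return x * x; }

    int main() {
        Dual r = square_generic(Dual{3.0, 1.0});  // seed dx/dx = 1
        std::cout << r.a << " " << r.b << "\n";   // value 9, derivative 6
        // square_fixed(Dual{3.0, 1.0});          // would not compile
    }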
Using dual numbers and modern C++ you can write a library that can do this:

    auto lambda =
        [](auto x, auto y)
        {
            // The Rosenbrock function.
            auto d0 = y[0] - x[0]*x[0];
            auto d1 = 1 - x[0];
            return 100 * d0*d0 + d1*d1;
        };

    // The lambda function will be copied and
    // automatically differentiated. The derivatives
    // are computed using templates, not numerically.
    //
    // No need to derive or compute derivatives
    // manually!
    auto func = make_differentiable<1, 1>(lambda);
"func" now has code for first-order, second-order derivatives all generated and heavily optimized at compile-time.
This is one reason C++ is so good for (mathematical) optimization.
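For readers who want to see the mechanics: make_differentiable above is the commenter's own library, but the core idea fits in a few lines. Here is a self-contained first-order sketch of my own (not that library's implementation), applied to a 1D slice of the Rosenbrock function from the comment above:

    #include <iostream>

    // A dual number a + b*eps with eps^2 = 0:
    // a carries the value, b carries the derivative.
    struct Dual {
        double a, b;
    };

    Dual operator+(Dual x, Dual y) { return { x.a + y.a, x.b + y.b }; }
    Dual operator-(double x, Dual y) { return { x - y.a, -y.b }; }
    Dual operator*(Dual x, Dual y) {
        // The product rule falls out of eps^2 = 0:
        // (a1 + b1*e)(a2 + b2*e) = a1*a2 + (a1*b2 + a2*b1)*e
        return { x.a * y.a, x.a * y.b + x.b * y.a };
    }
    Dual operator*(double s, Dual y) { return { s * y.a, s * y.b }; }

    int main() {
        // Fix y = 2 and differentiate f(x) = 100(y - x^2)^2 + (1 - x)^2
        // at x = 3. Seeding b = 1 marks x as the variable (dx/dx = 1).
        Dual x{3.0, 1.0};
        double y = 2.0;

        Dual d0 = y - x * x;
        Dual d1 = 1.0 - x;
        Dual f  = 100.0 * (d0 * d0) + d1 * d1;

        std::cout << "f(3)  = " << f.a << "\n";  // 4904
        std::cout << "f'(3) = " << f.b << "\n";  // 8404
    }

The derivative falls out of ordinary overload resolution, with no numerical approximation anywhere, which is why the compile-time-generated code can be so fast.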
One of the libraries I'm using for computer vision depends on the internal dual-number class Jet [0] from Google's ceres-solver to do automatic differentiation.

Reading through the implementation was worthwhile research for understanding the applications.

[0]: https://github.com/ceres-solver/ceres-solver/blob/master/include/ceres/jet.h
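A small usage sketch, assuming Ceres is installed: Jet<T, N> carries a value a and an N-vector of partial derivatives v, and constructing it with a slot index seeds that slot's derivative to 1.

    #include <iostream>
    #include "ceres/jet.h"

    int main() {
        // x = 2 + eps: the second constructor argument seeds slot 0.
        ceres::Jet<double, 1> x(2.0, 0);

        // Ordinary-looking arithmetic; the chain rule is applied by
        // the overloaded operators and math functions in jet.h.
        ceres::Jet<double, 1> y = x * x + sin(x);

        std::cout << "f(2)  = " << y.a << "\n";     // 4 + sin(2)
        std::cout << "f'(2) = " << y.v[0] << "\n";  // 2*2 + cos(2)
    }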
Super cool. In this example it's differentiation of a one-dimensional curve, but one can use the dual numbers to compute tangent spaces of more interesting algebraic objects as well. See, e.g., Remark 5.38 here: http://www.jmilne.org/math/CourseNotes/AG500.pdf
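The flavor of the statement, paraphrased from memory (see Milne's notes for the precise formulation): points of a variety V over the dual numbers k[ε] = k[x]/(x²) that reduce to a point p correspond exactly to tangent vectors at p,

    T_p(V) \;\simeq\; \{\, \alpha \in V(k[\varepsilon]) : \alpha \equiv p \pmod{\varepsilon} \,\}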
Dual numbers are a blast, especially with type-templated languages. I wrote a Scala implementation some time ago that I would like to share: http://www.win-vector.com/blog/2010/06/automatic-differentiation-with-scala/
I've implemented this in Python a while ago, also (ab)using operator overloading: https://github.com/boppreh/derivative

Not remarkable, but works.

    # f(x) = 5x + x^2 - 2/x + 3/x^2
    f = lambda x: x * 5 + x ** 2 - 2 / x + 3 / x ** 2
    # f'(x) = 5 + 2x + 2/x^2 - 6/x^3, so f'(6) ≈ 17.028
    print(derive(f, 6))
I don't like this use of dual-number notation for differentiation, because division by ε is not defined. For example, how would one calculate ε^2/ε? If ε is the matrix [0,1;0,0], then it doesn't have an inverse and thus the expression could not be evaluated.

On the other hand, little-o notation [1] was invented for exactly this purpose. It is easy to evaluate derivatives using it: for example (x+ε)^3 = x^3 + 3*x^2*ε + o(ε), and so ((x+ε)^3 - x^3)/ε = 3*x^2 + o(1), where o(1) tends to 0 as ε tends to 0.

[1] https://en.wikipedia.org/wiki/Big_O_notation#Little-o_notation
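Spelling out the matrix picture (the standard 2×2 representation; the determinant computation shows exactly which dual numbers are invertible):

    a + b\varepsilon \;\longleftrightarrow\; \begin{pmatrix} a & b \\ 0 & a \end{pmatrix},
    \qquad
    \varepsilon \;\longleftrightarrow\; \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},
    \qquad
    \varepsilon^2 = 0

    \det\begin{pmatrix} a & b \\ 0 & a \end{pmatrix} = a^2
    \;\Longrightarrow\;
    a + b\varepsilon \text{ is invertible iff } a \neq 0,
    \text{ so } \varepsilon \text{ itself has no inverse.}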
The article is interesting, but one has to be aware that dual numbers are not a complete solution to automatic differentiation.

The reason is that for full generality one would in principle need to keep the ε^n terms to arbitrary order, while dual numbers set ε^2 = 0, which is a strong limitation.

For example, with dual numbers you can compute that

limit(x->0) sin(x) / x = 1

just by setting x = ε, but they will fail with

limit(x->0) (cos(x) - 1) / x^2 = -1/2

because the quadratic term you need vanishes once you set x = ε.
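The fix, as the comment suggests, is to carry more terms. A minimal sketch of a second-order forward-mode type (hypothetical, not any library's API): track the value plus the first and second derivatives, apply the second-order chain rule, and the cos example above comes out right.

    #include <cmath>
    #include <iostream>

    // Second-order forward-mode AD: a degree-2 truncated Taylor expansion.
    struct Dual2 {
        double f;   // f(x0)
        double d1;  // f'(x0)
        double d2;  // f''(x0)
    };

    // Second-order chain rule for cos:
    // (cos u)'  = -sin(u) u'
    // (cos u)'' = -cos(u) u'^2 - sin(u) u''
    Dual2 cos(Dual2 u) {
        return { std::cos(u.f),
                 -std::sin(u.f) * u.d1,
                 -std::cos(u.f) * u.d1 * u.d1 - std::sin(u.f) * u.d2 };
    }

    Dual2 operator-(Dual2 a, double b) { return { a.f - b, a.d1, a.d2 }; }

    int main() {
        Dual2 x{0.0, 1.0, 0.0};   // the variable x, seeded at x0 = 0
        Dual2 g = cos(x) - 1.0;   // g(x) = cos(x) - 1

        // Taylor: g(x) = g(0) + g'(0) x + (g''(0)/2) x^2 + ...
        // Since g(0) = g'(0) = 0, the limit of g(x)/x^2 is g''(0)/2.
        std::cout << g.d2 / 2.0 << "\n";  // prints -0.5
    }

This generalizes to truncated Taylor polynomials of any order, or equivalently to nesting first-order duals, which is how forward-mode tools obtain higher derivatives.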