I have to be honest, I don't think this is a good explanation. I don't know what differentiable programming is, but I'm fairly sure I have the mathematical background to understand it. But I didn't come away from this article with any confidence that I'm following along.

On a superficial level it seems like it:

1. Generalizes deep learning to an optimization function over decomposable input, and

2. Reduces the number of parameters required to learn the input by exploiting the structure of the input, thereby making learning more efficient.

Is that correct? Is it completely off? What am I missing? Is there any more meat to the article than this?

Could someone who has upvoted this (and ideally understands the topic well) provide a different explanation of the concept? It would be great if I could see a real-world example (even a relatively trivial one) represented in both the traditional matrix computation form and the sexy new differentiable form.
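To make that contrast concrete, here is a minimal sketch of one tiny least-squares problem done both ways (my own example in Python with JAX; it is not from the article): the "traditional form" derives the gradient by hand with matrix calculus, while the "differentiable form" lets the system differentiate the program.

    # Minimal sketch: hand-derived matrix gradient vs. automatic
    # differentiation of the same loss. Data and values are made up.
    import jax.numpy as jnp
    from jax import grad

    X = jnp.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    y = jnp.array([1.0, 2.0, 3.0])

    def loss(w):
        return jnp.sum((X @ w - y) ** 2)

    w = jnp.array([0.1, 0.1])

    # Traditional form: gradient of ||Xw - y||^2 derived by hand.
    manual_grad = 2.0 * X.T @ (X @ w - y)

    # Differentiable-programming form: differentiate the program itself.
    auto_grad = grad(loss)(w)

    print(manual_grad, auto_grad)  # the two agree

The selling point of the second form is that it keeps working when `loss` grows into an arbitrary program, where deriving the gradient by hand stops being practical.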
Differentiable does not mean easy to optimize.
One could imagine implementing SHA-256 using differentiable operators, and yet the system as a whole would not be optimizable at all.
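A toy illustration of that point (my own, far simpler than SHA-256): a scrambler built purely from smooth operations is differentiable everywhere, yet its loss landscape is useless for gradient descent.

    # Every operation below is smooth, so the composition is
    # differentiable end to end, yet it behaves like a hash.
    import jax.numpy as jnp
    from jax import grad

    def scramble(x):
        # Ten rounds of high-frequency mixing.
        for _ in range(10):
            x = jnp.sin(50.0 * x + 1.0)
        return x

    target = scramble(0.42)

    def loss(x):
        return (scramble(x) - target) ** 2

    # The gradient exists at every point, but it is astronomically
    # large and flips sign on tiny scales, so gradient descent makes
    # no real progress toward recovering 0.42.
    print(grad(loss)(0.1))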
It would be interesting to have compilers that optimize the "optimizability" of differentiable programs, though...

Also, here are two interesting examples of differentiation through physical systems for classification:

https://arxiv.org/pdf/1808.08412.pdf

https://innovate.ee.ucla.edu/wp-content/uploads/2018/07/2018-optical-ml-neural-network.pdf
Could someone list some practical examples where differentiable programming would be useful?

I am familiar with areas where neural networks and convolutional networks have done well, especially around image processing.

But I can't imagine where having differentiable code would help unless it is just tying multiple neural networks together in a continuous chain of differentiation.

For most programming tasks, I can't imagine how differentiation would be possible or beneficial.

Is there a possibility that one could start with a series of unit tests and partial results and, through gradient descent, actually arrive at additional passing test cases? In my experience, passing additional test cases like this usually requires significantly more complex structures that would not be found via differentiation.
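One concrete family of examples: differentiating through an ordinary simulation loop to tune its inputs, which is roughly what the physical-systems papers linked above do at scale. A toy sketch of my own (Python with JAX; the constants and names are made up for illustration), where gradient descent tunes a launch speed by differentiating through the simulator:

    import jax.numpy as jnp
    from jax import grad

    def distance_after_5s(v):
        # Euler integration of a projectile with simple drag. This is
        # ordinary Python control flow, yet JAX differentiates through
        # the whole loop.
        angle = 0.8
        vx, vy = v * jnp.cos(angle), v * jnp.sin(angle)
        x, dt = 0.0, 0.01
        for _ in range(500):
            vx = vx - 0.1 * vx * dt
            vy = vy - (9.8 + 0.1 * vy) * dt
            x = x + vx * dt
        return x  # horizontal distance after five simulated seconds

    # Tune the launch speed so the simulated distance hits 30 m.
    loss_grad = grad(lambda v: (distance_after_5s(v) - 30.0) ** 2)
    v = 20.0
    for _ in range(100):
        v = v - 0.1 * loss_grad(v)
    print(v, distance_after_5s(v))

No neural network anywhere; the "model" is just a program whose parameters we fit. That said, I don't think this helps with the unit-test question: test pass/fail is discrete, so there is no useful gradient to follow there.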
Something I don't understand about automatic differentiation: why not use a Computer Algebra System instead to generate derivatives of the given functions?
I found this paper that helps answer the question: https://arxiv.org/pdf/1803.10228.pdf
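The short version, as I understand it: symbolic differentiation in a CAS suffers from expression swell and handles program constructs (loops, branches, mutation) poorly, whereas automatic differentiation propagates derivative values through the program itself at roughly the cost of one extra evaluation. A small illustration of my own (Python, using sympy for the symbolic side and JAX for the AD side):

    import sympy

    x = sympy.symbols('x')
    expr = x
    for _ in range(8):
        expr = sympy.sin(expr) * sympy.cos(expr)

    # Symbolic differentiation of the nested expression balloons in
    # size ("expression swell"): count the operations in the result.
    print(sympy.count_ops(sympy.diff(expr, x)))

    # Automatic differentiation never builds that expression; it
    # carries derivative values through the same program at roughly
    # the cost of one extra evaluation.
    import jax.numpy as jnp
    from jax import grad

    def f(t):
        for _ in range(8):
            t = jnp.sin(t) * jnp.cos(t)
        return t

    print(grad(f)(0.5))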
Be cautious using this article to try to learn anything. Differentiable programming is not actually specific to deep learning; it's another name for automatic differentiation, a technique that is very important in deep learning implementations but also valuable for a variety of other tasks where gradients of arbitrary functions are needed.

The article is correct that "Differentiable Programming" seems to be a rebranding effort, one that I believe mainly helped automatic differentiation work from the machine learning world get published in Programming Languages journals. I wouldn't read too much into it.