I took a Machine Learning course this past winter, and this article would have been really helpful, since this concept (and gradient descent in general) is what I struggled with most. While most resources show you the mechanics of neural networks, none I found were very good at explaining (to me) the purpose and meaning behind them. Sure, I could follow along and eventually figure out how to write my own neural network, and I did, but I honestly never completely understood what was going on. The problem with most ML texts/resources for people like me without a strong math background is that a lot of high-level math is presented without an explanation of which mathematical concepts are being used. I admit that the onus is on me, the math dummy, to go out and learn the concepts involved, but it's difficult to look at a confusing algorithm chock full of unfamiliar concepts and know where to start. This article explains things nicely, and I hope to see more like it in ML.
This is a very welcome read, but it hypes neural networks a bit. I've been working with them in JavaScript using IndexedDB, and while researching I was disappointed to find that some smart people seem to think they are much more limited than they're made out to be here and elsewhere: https://www.youtube.com/watch?v=AyzOUbkUf3M#t=242

To summarize, people generally abandoned backpropagation-trained neural networks for Support Vector Machines because neural nets require labeled (and fairly limited) datasets, and they train slowly, especially with multiple layers, which is sort of the whole point.

In my own JavaScript work, I was able to pull off only a single-layer perceptron, and it is neat but limited in what it can model.
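For anyone curious what a single-layer perceptron boils down to, here is a minimal sketch in Python (illustrative only, not the JavaScript/IndexedDB code mentioned above); it also hints at why a single layer tops out at linearly separable problems:

    # A hypothetical minimal single-layer perceptron, for illustration only.
    # The classic update rule is w += lr * (target - prediction) * x.
    def train_perceptron(samples, lr=0.1, epochs=25):
        n = len(samples[0][0])
        w, b = [0.0] * n, 0.0
        for _ in range(epochs):
            for x, target in samples:
                pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
                err = target - pred
                w = [wi + lr * err * xi for wi, xi in zip(w, x)]
                b += lr * err
        return w, b

    # AND is linearly separable, so a single layer handles it; XOR is not,
    # which is exactly the kind of limit a single-layer model runs into.
    data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
    print(train_perceptron(data))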
The best text I've read about backpropagation is in "Neural Networks: A Systematic Introduction" by Raul Rojas [1].

He uses a nice graphical approach that is easily understandable yet formal. It's been many years since I read it while preparing for a university exam, but I remember it was an enjoyable read, and I wished I had more time to spend on the book.

[1] http://www.amazon.com/Neural-Networks-A-Systematic-Introduction/dp/3540605053/
The backprop idea is one instance of a more general technique called reverse accumulation (which you can use in other contexts): http://www.win-vector.com/blog/2010/07/gradients-via-reverse-accumulation/
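To make the connection concrete, here is a minimal, hypothetical Python sketch of reverse accumulation: the forward pass records each operation's local derivatives, and the backward sweep pushes adjoints from the output toward the inputs. Backpropagation is the same idea applied to a network's loss:

    # A tiny tape-style sketch of reverse accumulation (reverse-mode autodiff).
    # Each Var remembers its parents and the local derivatives w.r.t. them.
    import math

    class Var:
        def __init__(self, value, parents=()):
            self.value = value
            self.parents = parents  # list of (parent Var, local derivative)
            self.grad = 0.0
        def __add__(self, other):
            return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])
        def __mul__(self, other):
            return Var(self.value * other.value,
                       [(self, other.value), (other, self.value)])

    def sin(v):
        return Var(math.sin(v.value), [(v, math.cos(v.value))])

    def backward(output):
        # Visit nodes in reverse topological order so each adjoint is
        # complete before it is pushed to its parents.
        order, seen = [], set()
        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent, _ in node.parents:
                    visit(parent)
                order.append(node)
        visit(output)
        output.grad = 1.0
        for node in reversed(order):
            for parent, local in node.parents:
                parent.grad += node.grad * local

    # f(x, y) = x*y + sin(x); expect df/dx = y + cos(x), df/dy = x
    x, y = Var(2.0), Var(3.0)
    f = x * y + sin(x)
    backward(f)
    print(x.grad, y.grad)  # roughly 2.5839 and 2.0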
"Neural networks, a beautiful biologically-inspired programming paradigm which enables a computer to learn from observational data"<p>In the context of CS what's the difference between "learning" and optimizing?