Neural networks and support vector machines are essentially equivalent [1]. Both approaches effectively project the input into a high-dimensional feature space and then draw a hyperplane between two different point sets. How clever this is depends on how the algorithm constructs that feature space. The article's comments could be interpreted as saying that deep neural networks allow feature spaces that would otherwise require many more neurons.<p>But first consider that being divided by a plane in a feature space is simply a convenient quality that many patterns have. It's similar to data you can draw a line through to extrapolate further values. However, unlike that approximately linear data, you can't say "why" your complex data is separated by a particular plane in the feature space. The reason is that the data in your neural network or SVM is more or less trapped in the model: it's not going to be processed further, except by using that model for that particular pattern.<p>[1] <a href="http://www.scm.keele.ac.uk/staff/p_andras/PAnpl2002.pdf" rel="nofollow">http://www.scm.keele.ac.uk/staff/p_andras/PAnpl2002.pdf</a>
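To make the "project into a feature space, then draw a hyperplane" idea concrete, here is a minimal sketch using XOR, which no straight line separates in the original two dimensions. The feature map (adding the product term) and the plane's weights are hand-picked assumptions for illustration; nothing here is learned, and it is not how any particular network or SVM would arrive at them.

```python
# Hand-picked feature map and hyperplane for XOR (illustrative assumptions,
# not produced by any training algorithm).

def feature_map(x1, x2):
    """Lift a 2-D point into a 3-D feature space by adding the product term."""
    return (x1, x2, x1 * x2)

# Hyperplane w . phi(x) + b = 0 in the lifted space.
W = (1.0, 1.0, -2.0)
B = -0.5

def classify(x1, x2):
    """Return 1 if the lifted point lies on the positive side of the plane."""
    phi = feature_map(x1, x2)
    score = sum(w * f for w, f in zip(W, phi)) + B
    return 1 if score > 0 else 0

# The four XOR points with their labels.
XOR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

predictions = [classify(*point) for point, _ in XOR_DATA]
print(predictions)  # [0, 1, 1, 0]
```

The plane separates the lifted points cleanly, but the weights alone say little about "why" the pattern looks the way it does, which is the point made above.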
We've tried to consolidate some training tips here:
<a href="http://deeplearning4j.org/debug.html" rel="nofollow">http://deeplearning4j.org/debug.html</a>
<a href="http://deeplearning4j.org/troubleshootingneuralnets.html" rel="nofollow">http://deeplearning4j.org/troubleshootingneuralnets.html</a>
<a href="http://deeplearning4j.org/trainingtricks.html" rel="nofollow">http://deeplearning4j.org/trainingtricks.html</a><p>There are many methods. The first thing to tackle is getting your data into the right format. Plotting software like Matplotlib can be really helpful when you're trying to debug.
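As a rough sketch of what checking your data format can look like before any training happens, the snippet below summarizes a feature matrix with NumPy and offers an optional Matplotlib histogram per feature. The function names `summarize_features` and `plot_feature_histograms` are hypothetical helpers made up for this example; they are not part of Deeplearning4j or any other library.

```python
import numpy as np

def summarize_features(X):
    """Return basic checks worth eyeballing before training (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    return {
        "shape": X.shape,
        "has_nan": bool(np.isnan(X).any()),
        "has_inf": bool(np.isinf(X).any()),
        "min": float(np.nanmin(X)),
        "max": float(np.nanmax(X)),
        # Per-column statistics, ignoring NaNs, to spot unscaled features.
        "feature_means": np.nanmean(X, axis=0),
        "feature_stds": np.nanstd(X, axis=0),
    }

def plot_feature_histograms(X, bins=30):
    """Draw one histogram per feature column (requires Matplotlib)."""
    # Imported lazily so the numeric checks above work without Matplotlib.
    import matplotlib.pyplot as plt
    X = np.asarray(X, dtype=float)
    fig, axes = plt.subplots(1, X.shape[1], squeeze=False)
    for j, ax in enumerate(axes[0]):
        column = X[:, j]
        ax.hist(column[np.isfinite(column)], bins=bins)
        ax.set_title(f"feature {j}")
    plt.show()

# Toy data: the second column is on a very different scale and has a gap.
example = [[0.0, 200.0], [1.0, float("nan")], [0.5, 180.0]]
summary = summarize_features(example)
print(summary["shape"], summary["has_nan"], summary["min"], summary["max"])
```

Calling `plot_feature_histograms(example)` afterwards gives the visual version of the same check.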
What happens when you, instead of training the entire network at once, train for a while with a single layer, then add a second layer and train with both layers, then add a third layer and train with all three layers, and so on?
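One way to try this out on a toy problem is sketched below in plain NumPy: train with one hidden layer, then insert a new hidden layer just before the output and keep training every layer, and repeat. All specifics are assumptions made for the sketch (XOR data, tanh hidden units of a fixed width so the output layer can be reused, sigmoid output with cross-entropy loss, full-batch gradient descent, the step counts and learning rate). It only shows the mechanics; it does not answer whether the approach helps on real problems.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layer(n_in, n_out):
    """Small random weights and zero biases for one dense layer."""
    return {"W": rng.normal(0.0, 0.5, size=(n_in, n_out)), "b": np.zeros(n_out)}

def forward(hidden, out, X):
    """Run tanh hidden layers, then a sigmoid output; keep activations for backprop."""
    acts = [X]
    for layer in hidden:
        acts.append(np.tanh(acts[-1] @ layer["W"] + layer["b"]))
    logits = acts[-1] @ out["W"] + out["b"]
    return acts, 1.0 / (1.0 + np.exp(-logits))

def loss(hidden, out, X, y):
    """Mean binary cross-entropy of the current network."""
    _, p = forward(hidden, out, X)
    p = np.clip(p, 1e-9, 1 - 1e-9)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def train(hidden, out, X, y, steps, lr):
    """Full-batch gradient descent on every layer currently in the network."""
    n = X.shape[0]
    for _ in range(steps):
        acts, p = forward(hidden, out, X)
        delta = (p - y) / n  # gradient w.r.t. output logits (sigmoid + cross-entropy)
        grad_W, grad_b = acts[-1].T @ delta, delta.sum(axis=0)
        # Backpropagate through the output layer before updating its weights.
        delta = (delta @ out["W"].T) * (1 - acts[-1] ** 2)
        out["W"] -= lr * grad_W
        out["b"] -= lr * grad_b
        for i in range(len(hidden) - 1, -1, -1):
            grad_W, grad_b = acts[i].T @ delta, delta.sum(axis=0)
            if i > 0:
                # acts[i] is the tanh output of hidden[i - 1].
                delta_prev = (delta @ hidden[i]["W"].T) * (1 - acts[i] ** 2)
            hidden[i]["W"] -= lr * grad_W
            hidden[i]["b"] -= lr * grad_b
            if i > 0:
                delta = delta_prev

# Toy data (assumption): XOR.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [1.0], [1.0], [0.0]])

WIDTH = 8  # fixed hidden width, so the output layer survives each growth step
hidden = [init_layer(2, WIDTH)]
out = init_layer(WIDTH, 1)
history = []
for depth in range(1, 4):
    if depth > 1:
        # The new layer goes just before the output; older layers keep their weights.
        hidden.append(init_layer(WIDTH, WIDTH))
    train(hidden, out, X, y, steps=2000, lr=0.5)
    history.append((depth, loss(hidden, out, X, y)))
    print(depth, history[-1][1])
```

Comparing the printed losses against a run that trains all three layers from scratch for the same total number of steps would be the natural next experiment.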
My recent comment on Reddit might be relevant here:<p><a href="https://www.reddit.com/r/MachineLearning/comments/2oeg5t/backpropagation_as_simple_as_possible_but_no/cmn7vnj" rel="nofollow">https://www.reddit.com/r/MachineLearning/comments/2oeg5t/bac...</a><p>(Disclaimer: I'm just a beginner ML/DL enthusiast.)