TechEcho
Why are deep neural networks hard to train?

83 points by wxs over 10 years ago

4 comments

joe_the_user over 10 years ago
So neural networks and support vector machines are essentially equivalent [1]. Both approaches effectively project the input into a high-level feature space and then draw a hyperplane between two different point sets. How clever this is depends on how the algorithm creates the feature space. The article's comments could be interpreted as saying that deep neural networks allow feature spaces which would otherwise require many more neurons.

But the thing is, first consider that being divided by a plane in a feature space is simply a convenient quality that many patterns have. It's similar to data you can draw a line along to extrapolate further values. However, unlike that approximately linear data, you can't say "why" your complex data is separated by a particular plane in the feature space. The reason is that what your neural network or SVM has learned is more or less trapped in the model: it's not going to be processed further, only used through that model for that particular pattern.

[1] http://www.scm.keele.ac.uk/staff/p_andras/PAnpl2002.pdf
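The "project into a feature space, then draw a hyperplane" idea in the comment above can be sketched on a toy case. This is only an illustration, not the construction from the linked paper: the feature map, the weights, and the bias are hand-picked assumptions chosen so that XOR, which no line separates in the original 2D input space, becomes separable by a plane once a product feature is added.

```python
# Assumption: a hand-picked feature map and hyperplane, for illustration only.

def feature_map(x1, x2):
    """Project a 2D input into a 3D feature space by adding the product x1*x2."""
    return (x1, x2, x1 * x2)

# Hypothetical hyperplane in feature space: w . phi(x) + b = 0
WEIGHTS = (1.0, 1.0, -2.0)
BIAS = -0.5

def classify(x1, x2):
    """Return 1 if the projected point lies on the positive side of the plane."""
    phi = feature_map(x1, x2)
    score = sum(w * f for w, f in zip(WEIGHTS, phi)) + BIAS
    return 1 if score > 0 else 0

xor_points = [(0, 0), (0, 1), (1, 0), (1, 1)]
predictions = [classify(x1, x2) for x1, x2 in xor_points]
print(predictions)
```

Here the feature map is fixed by hand. In an SVM it comes from the kernel, and in a neural network the hidden layers learn it, which is where the "how the algorithm creates the feature space" question comes in.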
vonnik over 10 years ago
We've tried to consolidate some training tips here:

http://deeplearning4j.org/debug.html
http://deeplearning4j.org/troubleshootingneuralnets.html
http://deeplearning4j.org/trainingtricks.html

There are many methods. The first thing to tackle is getting your data in the right format. Plotting software like Matplotlib can be really helpful when you're trying to debug.
TheLoneWolfling over 10 years ago
What happens when you, instead of training the entire network at once, train for a while with a single layer, then add a second layer and train with both layers, then add a third layer and train with all three layers, and so on?
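One way to read the procedure in the question above is sketched below: train a small network, insert a fresh hidden layer just before the output layer, train all layers together, and repeat. This resembles greedy layer-wise training, but it is only a toy under stated assumptions: XOR as the data, sigmoid units, squared error, plain per-sample gradient descent, and arbitrary sizes and learning rate. It makes no claim about how well the approach works in general, and `grow_and_train` and the other names are made up for this example.

```python
import math
import random

def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to keep math.exp from overflowing
    return 1.0 / (1.0 + math.exp(-z))

def new_layer(n_in, n_out, rng):
    # One row per unit: n_in weights followed by a bias as the last entry.
    return [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_out)]

def forward(layers, x):
    """Return the activations of every layer, starting with the input itself."""
    acts = [list(x)]
    for layer in layers:
        prev = acts[-1]
        acts.append([sigmoid(sum(w * a for w, a in zip(row, prev)) + row[-1])
                     for row in layer])
    return acts

def train_step(layers, x, y, lr):
    """One backpropagation update on a single sample (loss = 0.5 * error^2)."""
    acts = forward(layers, x)
    deltas = [(o - t) * o * (1 - o) for o, t in zip(acts[-1], y)]
    for i in reversed(range(len(layers))):
        layer, prev = layers[i], acts[i]
        prev_deltas = None
        if i > 0:  # propagate the error back using the weights before updating
            prev_deltas = [sum(layer[k][j] * deltas[k] for k in range(len(layer)))
                           * prev[j] * (1 - prev[j]) for j in range(len(prev))]
        for k, row in enumerate(layer):
            for j in range(len(prev)):
                row[j] -= lr * deltas[k] * prev[j]
            row[-1] -= lr * deltas[k]
        deltas = prev_deltas

def mean_loss(layers, data):
    return sum(0.5 * (forward(layers, x)[-1][0] - y[0]) ** 2
               for x, y in data) / len(data)

def grow_and_train(data, hidden=3, max_hidden_layers=3, epochs=200, lr=0.5, seed=0):
    rng = random.Random(seed)
    layers = [new_layer(len(data[0][0]), hidden, rng), new_layer(hidden, 1, rng)]
    history = []  # (number of hidden layers, mean loss after that stage)
    while True:
        for _ in range(epochs):
            for x, y in data:
                train_step(layers, x, y, lr)  # all current layers are updated
        n_hidden = len(layers) - 1
        history.append((n_hidden, mean_loss(layers, data)))
        if n_hidden == max_hidden_layers:
            return layers, history
        # Add a new randomly initialised hidden layer just before the output.
        layers.insert(-1, new_layer(hidden, hidden, rng))

XOR_DATA = [((0, 0), (0,)), ((0, 1), (1,)), ((1, 0), (1,)), ((1, 1), (0,))]
layers, history = grow_and_train(XOR_DATA)
for n_hidden, loss in history:
    print(n_hidden, "hidden layer(s): mean loss", round(loss, 4))
```

Note one design choice: the new layer starts from random weights, so the output layer trained in the previous stage suddenly sees different inputs and the loss can jump when a layer is added. Initialising the new layer closer to an identity mapping would be another option.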
yudlejoza over 10 years ago
My recent comment on Reddit might be relevant to this:

https://www.reddit.com/r/MachineLearning/comments/2oeg5t/backpropagation_as_simple_as_possible_but_no/cmn7vnj

(Disclaimer: I'm just a beginner ML/DL enthusiast.)