Why are deep neural networks hard to train?

83 points by wxs, over 10 years ago

4 comments

joe_the_user, over 10 years ago
So neural networks and support vector machines are essentially equivalent [1]. Both approaches effectively project the input into a high-level feature space and then draw a hyperplane between two different point sets. How clever this is depends on how the algorithm creates the feature space. The article's comments could be interpreted as saying that deep neural networks allow feature spaces that would otherwise require many more neurons.

But first consider that being divisible by a plane in a feature space is simply a convenient property that many patterns have. It's similar to data you can draw a line through to extrapolate further values. However, unlike that approximately linear data, you can't say "why" your complex data is separated by a particular plane in the feature space. The reason is that what your neural network or SVM has learned is more or less trapped in the model: it's not going to be processed further, except by using that model for that particular pattern.

[1] http://www.scm.keele.ac.uk/staff/p_andras/PAnpl2002.pdf
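
To make the "feature space plus hyperplane" picture above concrete, here is a minimal sketch using the XOR points, which no straight line separates in the raw input plane. The feature map and the hyperplane weights are hand-picked assumptions for illustration; they do not come from the comment or the linked paper, and nothing here is learned by an actual SVM or network.

```python
import numpy as np

# The four XOR points: no straight line in the raw (x1, x2) plane separates the classes.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0, 1, 1, 0])

def feature_map(X):
    """Hypothetical hand-picked feature map: (x1, x2) -> (x1, x2, x1 * x2)."""
    return np.column_stack([X[:, 0], X[:, 1], X[:, 0] * X[:, 1]])

# Hand-picked hyperplane w . phi(x) + b = 0 in the 3-D feature space
# (an assumption for illustration, not a trained model).
w = np.array([1.0, 1.0, -2.0])
b = -0.5

# Points with a positive score fall on the class-1 side of the hyperplane.
scores = feature_map(X) @ w + b
predictions = (scores > 0).astype(int)
print(predictions)  # matches y: [0 1 1 0]
```

The extra product feature is what makes the classes separable by a plane; an SVM kernel or a hidden layer plays the same role, except that the mapping is chosen or learned rather than written by hand.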
vonnik, over 10 years ago
We've tried to consolidate some training tips here:
http://deeplearning4j.org/debug.html
http://deeplearning4j.org/troubleshootingneuralnets.html
http://deeplearning4j.org/trainingtricks.html

There are many methods. The first to tackle is getting your data in the right format. Plotting software like Matplotlib can be really helpful when you're trying to debug.
TheLoneWolfling, over 10 years ago
What happens when you, instead of training the entire network at once, train for a while with a single layer, then add a second layer and train with both layers, then add a third layer and train with all three layers, and so on?
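
For anyone who wants to try the procedure asked about above, here is a rough NumPy sketch: train a one-hidden-layer network, then repeatedly insert a new hidden layer and train all layers again. Everything specific here is an assumption made for illustration, including the XOR toy data, the layer width, the learning rate, the step counts, and the near-identity initialization of each new layer. It makes no claim about how well this works on real problems.

```python
import numpy as np

def forward(weights, X):
    """Return the activations of every layer (input first, output last)."""
    acts = [X]
    for i, (W, b) in enumerate(weights):
        z = acts[-1] @ W + b
        is_output = i == len(weights) - 1
        # Sigmoid on the output layer, tanh on hidden layers.
        acts.append(1.0 / (1.0 + np.exp(-z)) if is_output else np.tanh(z))
    return acts

def train(weights, X, y, steps=2000, lr=0.5):
    """Full-batch gradient descent on cross-entropy loss, updating all layers."""
    for _ in range(steps):
        acts = forward(weights, X)
        # Gradient w.r.t. the output pre-activation (sigmoid + cross-entropy).
        delta = (acts[-1] - y) / len(X)
        for i in reversed(range(len(weights))):
            W, b = weights[i]
            grad_W = acts[i].T @ delta
            grad_b = delta.sum(axis=0)
            if i > 0:
                # Backpropagate through the tanh of the layer below.
                delta = (delta @ W.T) * (1.0 - acts[i] ** 2)
            weights[i] = (W - lr * grad_W, b - lr * grad_b)
    return weights

def add_hidden_layer(weights, rng):
    """Insert a new hidden layer just below the output layer."""
    width = weights[-1][0].shape[0]
    # Near-identity init (a hypothetical choice) so the grown network
    # starts out roughly close to the one already trained.
    W_new = np.eye(width) + 0.01 * rng.standard_normal((width, width))
    return weights[:-1] + [(W_new, np.zeros(width)), weights[-1]]

rng = np.random.default_rng(0)
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [1.0], [1.0], [0.0]])  # XOR targets (toy data)

hidden = 4  # assumed hidden-layer width
weights = [(rng.standard_normal((2, hidden)), np.zeros(hidden)),
           (rng.standard_normal((hidden, 1)), np.zeros(1))]

weights = train(weights, X, y)           # stage 1: one hidden layer
for _ in range(2):                       # stages 2 and 3: grow, then retrain everything
    weights = add_hidden_layer(weights, rng)
    weights = train(weights, X, y)

outputs = forward(weights, X)[-1]
print(len(weights), "layers, outputs:", outputs.ravel())
```

Each stage here retrains every layer, as the question describes. Freezing the earlier layers at each stage, or pretraining each layer without labels, would be different variants of the same idea.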
yudlejoza, over 10 years ago
My recent comment on reddit might be relevant to this:

https://www.reddit.com/r/MachineLearning/comments/2oeg5t/backpropagation_as_simple_as_possible_but_no/cmn7vnj

(Disclaimer: I'm just a beginner ML/DL enthusiast).