科技回声

6 comments

gamegoblin, nearly 11 years ago

A note on dropout: If your layer size is relatively small (not hundreds or thousands of nodes), dropout is usually detrimental, and a more traditional regularization method such as weight decay is superior. For networks of the size Hinton et al. are playing with nowadays (with thousands of nodes in a layer), dropout is good, though.
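For readers who want to see the two regularizers side by side, here is a minimal NumPy sketch. It is not taken from the slides or from any particular library: the function names, the `keep_prob` and `weight_decay` parameters, and the toy shapes are all assumptions made for illustration. It shows inverted dropout on a layer's activations and an SGD step with an L2 weight-decay term.

```python
import numpy as np


def dropout_forward(activations, keep_prob, rng, training=True):
    """Inverted dropout: zero each unit with probability 1 - keep_prob
    and rescale the survivors so the expected activation is unchanged."""
    if not training or keep_prob >= 1.0:
        return activations
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob


def sgd_step_with_weight_decay(weights, grad, lr, weight_decay):
    """One SGD step on loss + (weight_decay / 2) * ||weights||^2.
    The decay term shrinks every weight toward zero on each step."""
    return weights - lr * (grad + weight_decay * weights)


# Toy values (hypothetical): a batch of 4 examples with 8 hidden units.
rng = np.random.default_rng(0)
hidden = np.ones((4, 8))
dropped = dropout_forward(hidden, keep_prob=0.5, rng=rng)

# With a zero gradient, only the decay term moves the weights.
weights = np.full((8, 2), 0.5)
decayed = sgd_step_with_weight_decay(
    weights, np.zeros_like(weights), lr=0.1, weight_decay=0.01
)
```

With only 8 units, dropping half of them removes a large share of the layer's capacity on every step, which is one way to read the comment's point about small layers; weight decay shrinks all weights a little instead of removing any unit.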

vundervul, nearly 11 years ago

Who is Arno Candel and why should we pay attention to his tips on training neural networks? Anyone who suggests grid search for metaparameter tuning is out of touch with the consensus among experts in deep learning. A lot of people are coming out of the woodwork and presenting themselves as experts in this exciting area because it has had so much success recently, but most of them seem to be beginners. Having lots of beginners learning is fine and healthy, but a lot of these people act as if they are experts.
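The comment does not say which alternative to grid search it has in mind. Random search is one commonly cited alternative, and the sketch below illustrates it using only the Python standard library. Everything in it is an assumption for illustration: the search ranges, the parameter names, and especially `toy_validation_loss`, which is a hypothetical stand-in for training a model and measuring its validation loss.

```python
import math
import random


def sample_config(rng):
    """Draw one configuration; the learning rate is sampled log-uniformly."""
    return {
        "learning_rate": 10 ** rng.uniform(-4, -1),
        "hidden_units": rng.choice([64, 128, 256, 512]),
        "dropout_keep_prob": rng.uniform(0.5, 1.0),
    }


def toy_validation_loss(config):
    """Hypothetical objective standing in for a real train/validate run."""
    lr_term = (math.log10(config["learning_rate"]) + 2.5) ** 2
    dropout_term = (config["dropout_keep_prob"] - 0.8) ** 2
    return lr_term + dropout_term


def random_search(num_trials, seed=0):
    """Sample num_trials configurations and return the one with lowest loss."""
    rng = random.Random(seed)
    trials = [sample_config(rng) for _ in range(num_trials)]
    return min(trials, key=toy_validation_loss)


best = random_search(50)
```

Unlike a grid, each trial here tries a fresh value for every parameter, so 50 trials explore 50 distinct learning rates rather than a handful repeated across grid cells.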

agibsonccc, nearly 11 years ago

I would just like to link to my earlier comments for people who may be curious: https://news.ycombinator.com/item?id=7803101

I will also add that looking into Hessian-free optimization for training feed-forward nets, rather than conjugate gradient/LBFGS/SGD, has proven to be amazing [1]. Recursive nets I'm still playing with, but based on the work by Socher [2], they used LBFGS just fine.

[1]: http://www.cs.toronto.edu/~rkiros/papers/shf13.pdf
[2]: http://socher.org/

prajit, nearly 11 years ago

A question about the actual slides: why don't they use unsupervised pretraining (e.g. a sparse autoencoder) for predicting MNIST? Is it just to show that they don't need pretraining to achieve good results, or is there something deeper?

TrainedMonkey, nearly 11 years ago

Direct link to slides: http://www.slideshare.net/0xdata/h2o-distributed-deep-learning-by-arno-candel-071614

Tips for Better Deep Learning Models
