A note on dropout:

If your layer size is relatively small (not hundreds or thousands of nodes), dropout is usually detrimental, and a more traditional regularization method such as weight decay is superior.

For networks of the size Hinton et al. are playing with nowadays (thousands of nodes per layer), dropout is good, though.
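To make the contrast concrete, here is a minimal numpy sketch (all sizes and constants are made up for illustration, not taken from the slides) of the two options: dropout applied to a hidden layer's activations versus a weight-decay (L2) term folded into the gradient update.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical mid-training state: one hidden layer's weights W and a
    # batch of its activations h.
    W = rng.normal(scale=0.1, size=(100, 50))   # a "small" layer: tens of units
    h = rng.normal(size=(32, 50))               # batch of hidden activations

    # Option A: inverted dropout, applied only at training time. With a small
    # layer, zeroing half the units removes a big chunk of its capacity,
    # which is why it can hurt here.
    keep_prob = 0.5
    mask = rng.random(h.shape) < keep_prob
    h_train = (h * mask) / keep_prob            # rescale so the expected value is unchanged

    # Option B: weight decay (L2), added to the gradient in the SGD update.
    grad_W = rng.normal(size=W.shape)           # stand-in for the backprop gradient
    weight_decay = 1e-4
    learning_rate = 0.01
    W -= learning_rate * (grad_W + weight_decay * W)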
Who is Arno Candel, and why should we pay attention to his tips on training neural networks? Anyone who suggests grid search for hyperparameter tuning is out of touch with the consensus among experts in deep learning. A lot of people are coming out of the woodwork and presenting themselves as experts in this exciting area because it has had so much success recently, but most of them seem to be beginners. Having lots of beginners learning the field is fine and healthy, but many of them act as if they are experts.
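For what it's worth, the consensus being alluded to is presumably that random search tends to beat grid search for this kind of tuning (Bergstra & Bengio, 2012). A toy Python sketch of the difference, with a made-up objective (the evaluate stub is hypothetical, standing in for "train a net and return validation accuracy") and the same budget of 16 trials for each strategy:

    import itertools
    import math
    import random

    random.seed(0)

    def evaluate(lr, hidden):
        # Hypothetical stand-in for training a net and returning validation
        # accuracy; a made-up objective peaked near lr=1e-3, hidden=200.
        return math.exp(-(math.log10(lr) + 3) ** 2) * math.exp(-((hidden - 200) / 300) ** 2)

    # Grid search: a few fixed values per hyperparameter, 4 x 4 = 16 trials.
    grid = itertools.product([1e-1, 1e-2, 1e-3, 1e-4], [50, 100, 200, 400])
    best_grid = max(grid, key=lambda p: evaluate(*p))

    # Random search: the same 16-trial budget, sampled log-uniformly in the
    # learning rate and uniformly in the hidden size, so each individual
    # axis gets covered much more densely.
    samples = [(10 ** random.uniform(-4, -1), random.randint(50, 400)) for _ in range(16)]
    best_random = max(samples, key=lambda p: evaluate(*p))

    print("grid  :", best_grid)
    print("random:", best_random)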
I would just like to link to my earlier comments for people who may be curious:

https://news.ycombinator.com/item?id=7803101

I will also add that switching to Hessian-free optimization for training feed-forward nets, instead of conjugate gradient/L-BFGS/SGD, has proven to be amazing [1].

Recursive nets I'm still playing with, but in the work by Socher [2], they used L-BFGS just fine.

[1]: http://www.cs.toronto.edu/~rkiros/papers/shf13.pdf

[2]: http://socher.org/
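The paper in [1] is the place to look for the actual Hessian-free method; purely as a toy illustration of the quasi-Newton family mentioned above, here is a minimal scipy sketch that fits a tiny feed-forward net with L-BFGS (everything here, including the XOR data, the sizes, and the finite-difference gradients, is invented for the example and not taken from [1] or [2]):

    import numpy as np
    from scipy.optimize import minimize

    # Toy XOR problem and a one-hidden-layer net.
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([0, 1, 1, 0], dtype=float)
    n_in, n_hid = 2, 4

    def unpack(theta):
        # Split the flat parameter vector into the net's weights and biases.
        i = 0
        W1 = theta[i:i + n_in * n_hid].reshape(n_in, n_hid); i += n_in * n_hid
        b1 = theta[i:i + n_hid]; i += n_hid
        w2 = theta[i:i + n_hid]; i += n_hid
        b2 = theta[i]
        return W1, b1, w2, b2

    def loss(theta):
        W1, b1, w2, b2 = unpack(theta)
        h = np.tanh(X @ W1 + b1)
        p = 1.0 / (1.0 + np.exp(-(h @ w2 + b2)))  # sigmoid output
        eps = 1e-9
        return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

    rng = np.random.default_rng(0)
    theta0 = rng.normal(scale=0.5, size=n_in * n_hid + 2 * n_hid + 1)
    # No analytic gradient is supplied, so scipy falls back to finite differences.
    res = minimize(loss, theta0, method="L-BFGS-B")
    print("final cross-entropy:", res.fun)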
A question about the actual slides: why don't they use unsupervised pretraining (e.g., a sparse autoencoder) for MNIST? Is it just to show that they don't need pretraining to achieve good results, or is there something deeper?
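Not an answer, but for anyone unfamiliar with what that pretraining step would look like, here is a rough Keras sketch (Keras is my assumption, not what H2O uses, and the layer sizes are arbitrary) of fitting a sparse autoencoder on MNIST pixels and then reusing the encoder to initialize a supervised classifier:

    from tensorflow import keras

    (x_train, y_train), _ = keras.datasets.mnist.load_data()
    x_train = x_train.reshape(-1, 784).astype("float32") / 255.0

    # Unsupervised phase: reconstruct the inputs through a sparse bottleneck.
    inputs = keras.Input(shape=(784,))
    code = keras.layers.Dense(128, activation="relu",
                              activity_regularizer=keras.regularizers.l1(1e-5))(inputs)
    decoded = keras.layers.Dense(784, activation="sigmoid")(code)
    autoencoder = keras.Model(inputs, decoded)
    autoencoder.compile(optimizer="adam", loss="mse")
    autoencoder.fit(x_train[:10000], x_train[:10000], epochs=1, batch_size=128)

    # Supervised phase: the classifier shares the pretrained encoder layer
    # and fine-tunes it along with a new softmax output.
    outputs = keras.layers.Dense(10, activation="softmax")(code)
    classifier = keras.Model(inputs, outputs)
    classifier.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                       metrics=["accuracy"])
    classifier.fit(x_train[:10000], y_train[:10000], epochs=1, batch_size=128)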
Direct link to slides: http://www.slideshare.net/0xdata/h2o-distributed-deep-learning-by-arno-candel-071614