
Deep Learning From The Bottom Up

139 points by zercool · almost 11 years ago

8 comments

PieSquared · almost 11 years ago
In addition to issues raised by other commenters, one of the problems with deep learning (deep nets in general) is that they can be very hard to train. If you're interested in some of the techniques people have been using, I highly suggest you read up on optimization methods such as conjugate gradient and Hessian-free optimization. I did this recently [0] and have a brief write-up, but honestly the original Martens paper may be more understandable [1].

[0] http://andrew.gibiansky.com/blog/machine-learning/hessian-free-optimization/

[1] http://machinelearning.wustl.edu/mlpapers/paper_files/icml2010_Martens10.pdf
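To make the Hessian-free idea concrete, here is a minimal sketch (assuming PyTorch, with a toy least-squares problem standing in for a real network): rather than ever forming the Hessian H, it solves H d = -g by conjugate gradient using only Hessian-vector products (Pearlmutter's double-backprop trick). It omits the damping and minibatch curvature estimates of Martens' full method.

    # Minimal Hessian-free sketch: solve H d = -g with conjugate gradient,
    # using Hessian-vector products instead of the full Hessian.
    import torch

    torch.manual_seed(0)
    X = torch.randn(64, 10)             # toy inputs (hypothetical data)
    y = torch.randn(64, 1)              # toy targets
    w = torch.randn(10, 1, requires_grad=True)

    def loss_fn(w):
        return ((X @ w - y) ** 2).mean()

    loss = loss_fn(w)
    (g,) = torch.autograd.grad(loss, w, create_graph=True)  # gradient, kept in graph

    def hvp(v):
        # Hessian-vector product: differentiate g.v with respect to w.
        return torch.autograd.grad((g * v).sum(), w, retain_graph=True)[0]

    # Conjugate gradient on H d = -g (a handful of iterations is typical in HF).
    d = torch.zeros_like(w)
    r = -g.detach().clone()             # residual = -g - H d, with d = 0
    p = r.clone()
    for _ in range(10):
        Hp = hvp(p)
        alpha = (r * r).sum() / (p * Hp).sum()
        d = d + alpha * p
        r_new = r - alpha * Hp
        beta = (r_new * r_new).sum() / (r * r).sum()
        p = r_new + beta * p
        r = r_new

    with torch.no_grad():
        w += d                          # Newton-like step along the CG solution
    print(loss_fn(w).item())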
cjrd · almost 11 years ago
Hi, I'm one of the creators of Metacademy. I hope you find it useful. Feel free to follow our new Twitter account if you'd like low-volume updates:

https://twitter.com/meta_learning

Also, you can register an account for an occasional email.

PS: We're completely free and open source: https://github.com/metacademy/metacademy-application
nrmn · almost 11 years ago
For anyone actually interested in implementing DNNs, I wrote up a quick blog post (essentially a brain dump) of general guidelines to adhere to when training DNNs. This information comes primarily from videos given by Geoffrey Hinton as well as various papers.

http://343hz.com/general-guidelines-for-deep-neural-networks/
jarvic · almost 11 years ago
I just skimmed the post as I don't have time to fully read it right now, but I'll point out a couple of problems that you can run into with neural nets and associated approaches.

One issue that can be a back-breaker, depending on your application, is that, to produce a generalizable model, nets tend to need much more training data than the alternatives. There are ways to work around this, though.

The bigger problem to me is interpretability. Deep learning often gives feature sets that are very good for whatever task you are working on, but they are in some sense artificial, and it is difficult to relate changes in features to changes in the input data. I work with a lot of biological and medical data, and this is an issue because for some applications it is important not just to get accurate classification results, but to be able to understand what your features mean in the context of the original problem. I saw some interesting work in a computer vision paper earlier this year on learning to visualize how changes in a neural net's inputs and outputs are related; I'll try to dig that up later if anyone is interested.

I'm not sure how coherent that was, as I was trying to get this typed out in a hurry.
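For a concrete (if simple) example of relating a net's output back to its input, a gradient-based saliency map is one common approach. This is a minimal sketch assuming PyTorch; the two-layer net and random input below are hypothetical placeholders, not the paper's method:

    # Minimal gradient-based input attribution (a saliency map): large input
    # gradients mark features whose changes most affect the top class score.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    model = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 3))
    x = torch.randn(1, 64, requires_grad=True)   # a fake 8x8 "image", flattened

    score = model(x)[0].max()                    # score of the top class
    score.backward()                             # d(score)/d(input)

    saliency = x.grad.abs().reshape(8, 8)        # large values = influential pixels
    print(saliency)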
agibsonccc · almost 11 years ago
To address some of the comments being presented here: neural nets, despite being harder to train, can be debugged visually.

A few tips for those of you who use neural nets:

Debug the weights with histograms. Track the gradient and make sure its magnitude is not too large and that it's normally distributed.

Keep track of your gradient changes when using either gradient descent or conjugate gradient.

Plot your filters; visualize what each neuron is learning.

Watch the rate of change of your cost function. If it seems like it's changing too fast and stops early, lower your learning rate.

Plot your activations: if they start out grey, you're fine. If they start all black, you need to retune some of your parameters.

Lastly, understand the algorithm you're using. Convolutional nets are different from recursive neural tensor networks, which are different from denoising autoencoders, which are different from RBMs/DBNs.

Pay attention to your cost function: reconstruction entropy is used differently from negative log likelihood, and each suits different objectives.

If you are trying to do feature learning with RBMs or denoising autoencoders, you will use reconstruction entropy. This is what you use for feature detectors. You may end up using negative log likelihood if you are dealing with continuous data.

For RBMs, pay attention to the different kinds of units [1]. Hinton recommends Gaussian visible units with rectified linear hidden units for continuous data, and binary-binary otherwise.

For denoising autoencoders, watch your corruption level. A higher one helps generalize better, especially with less data.

For time series or sequential data, you can use a recurrent net, a moving window with DBNs, or recursive neural tensor networks.

Other knobs:

If your deep learning framework doesn't have AdaGrad, find one that does.

Dropout: crucial. Dropout is used in combination with mini-batch learning to handle learning different "poses" of images as well as generalizing feature learning. This can be used in combination with sampling with replacement to minimize sampling error.

Regularization: L2 is typically used. Hinton once said you want a neural net that always overfits but is regularized (YouTube video... don't remember the link right now).

Would love to answer questions! Source: I work on/teach this stuff. Still working my way up there, but it seems to be going well so far. [2][3]

Lastly, tweak one knob at a time. Neural nets have a lot going on. You don't want a situation where you A/B tested 10 different parameters at once and don't know which one worked or why. A sketch of some of these checks follows below.

[1]: http://www.cs.toronto.edu/~hinton/absps/guideTR.pdf

[2]: http://deeplearning4j.org/

[3]: http://zipfianacademy.com/

[4]: http://arxiv.org/abs/1206.5533 http://deeplearning4j.org/ http://deeplearning4j.org/debug.html http://yosinski.com/media/papers/Yosinski2012VisuallyDebuggingRestrictedBoltzmannMachine.pdf
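As a concrete illustration of a few of these tips (gradient tracking, weight histograms, dropout, L2, AdaGrad), here is a minimal sketch assuming PyTorch; the data and architecture are toy placeholders, not a recipe from the frameworks linked above:

    # Monitor gradient magnitude and weight histograms while training a small
    # net with dropout, L2 (weight decay), and AdaGrad.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    X, y = torch.randn(256, 20), torch.randint(0, 2, (256,))
    model = nn.Sequential(
        nn.Linear(20, 64), nn.ReLU(),
        nn.Dropout(p=0.5),                 # dropout, as recommended above
        nn.Linear(64, 2),
    )
    # weight_decay implements the L2 regularization mentioned in the comment
    opt = torch.optim.Adagrad(model.parameters(), lr=0.1, weight_decay=1e-4)
    loss_fn = nn.CrossEntropyLoss()        # negative log likelihood objective

    for step in range(100):
        opt.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        # Gradient check: overall magnitude should stay in a sane range.
        gnorm = torch.sqrt(sum((p.grad ** 2).sum() for p in model.parameters())).item()
        if step % 20 == 0:
            w = model[0].weight.detach()
            hist = torch.histc(w, bins=10)  # crude weight "histogram"
            print(f"step {step}: loss={loss.item():.3f} grad_norm={gnorm:.3f}")
            print("  weight histogram:", hist.tolist())
        opt.step()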
tshadwell · almost 11 years ago
I'm pretty familiar with neural networks, and skimming that article, it appears to describe something that is a neural network. Is 'Deep Learning' new terminology for 'Neural Network', or does it describe a subset of ways of using them?
sytelus · almost 11 years ago
THANK YOU for this link. Metacademy is amazing! I always wanted a tool like this, which shows me the graph of concepts I need to learn before I can learn X. I wish we had this kind of learning-plan graph for other fields as well.
elliptic · almost 11 years ago
Anyone know of good papers relating to deep learning that are not about image classification or speech/text recognition?