
When not to use deep learning

254 points by jpn, almost 8 years ago

6 comments

peterburkimsher, almost 8 years ago
My naïve understanding of deep learning is that it works by finding patterns in the answers, instead of actually solving problems.

If I take a multiple-choice exam and always answer "C", then I have a good chance at getting more than 25%.

For image recognition, I think the classifier is doing the real work (trying to actually answer the question), and the deep learning is just seeing if the answer matches the pattern of expected answers.

Somehow, this actually works. I think that it's because true randomness is hard to find.

The problem that I've found is that it's really difficult to teach deep learning. I'm making a Chinese-English teaching tool (http://pingtype.github.io) and sourcing my translations from Google Translate. I find a lot of mistakes in my dictionary that obviously came from Google's model getting the word spacing wrong. I can fix it in my own dictionary immediately. If I submit the correction to Google, it just changes some weightings, and hundreds of people will have to submit the same correction before their deep learning finally catches on that it needs to change something.

nilkn, almost 8 years ago
I was expecting more discussion of alternatives.

For instance, in cases where deep neural networks aren't desirable or don't outperform classical approaches, I'm a big fan of boosted decision trees, due to their accuracy on many real-world datasets, their ease of use, and the existence of great open-source implementations. xgboost (which routinely wins Kaggle competitions) and Spark MLlib both have high-performance distributed training algorithms for gradient-boosted trees. And as far as hyperparameter searches go, there just aren't as many parameters to optimize. (And frameworks like Spark are already fantastic for embarrassingly parallel tasks like hyperparameter searches.)
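
A minimal sketch of what this kind of gradient-boosted tree baseline looks like, using xgboost's scikit-learn wrapper; the synthetic dataset and the parameter values below are illustrative assumptions, not anything recommended in the thread:

    # Gradient-boosted trees as a quick, strong baseline (illustrative sketch).
    import xgboost as xgb
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split

    # Synthetic tabular data standing in for a real-world dataset.
    X, y = make_classification(n_samples=10_000, n_features=20, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    model = xgb.XGBClassifier(
        n_estimators=300,   # number of boosting rounds
        max_depth=6,        # tree depth: one of the few key hyperparameters
        learning_rate=0.1,
        n_jobs=-1,
    )
    model.fit(X_train, y_train)
    print("test accuracy:", model.score(X_test, y_test))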

therajiv, almost 8 years ago
The author discusses how linear models are generally more interpretable than deep learning methods, but I'd argue that's actually changing pretty quickly. Especially for large image/sequence inputs (which cover most of the applications that are getting hyped up), linear regressions don't perform very well, and often that performance difference prevents them from picking out important features. Given that fast, scalable methods for feature importance are on the rise (e.g. https://arxiv.org/abs/1704.02685, which the author mentions), you often get feature scores from deep models that are just as interpretable as those from linear models, from a model that is also more accurate.

Basically, my point is that model interpretation strongly depends on how accurate your model is, and because deep learning models are so much better than linear models for some tasks, it makes sense to use them, even if your primary goal *is* interpretability.

That said, I do believe that if you ever care at all about interpretation, you should almost never be using multilayer perceptrons (which have recently become part of the widening umbrella term "deep learning"), because they rarely work better than decision-tree models or basic linear models (and MLPs are generally no more interpretable than traditional methods).
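
As a rough illustration of the kind of fast per-feature attribution being referred to, here is a gradient-times-input sketch for a small PyTorch network. This is a simple stand-in, not DeepLIFT (the method in the linked paper), and the model and input below are made up for the example:

    # Cheap gradient-based feature attribution for a tiny network (illustrative).
    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 1))
    x = torch.randn(1, 20, requires_grad=True)  # one made-up input example

    score = model(x).sum()   # scalar output to differentiate
    score.backward()         # populates x.grad

    # Gradient * input is a common, inexpensive attribution heuristic:
    # one importance value per input feature.
    attribution = (x.grad * x).detach().squeeze()
    print(attribution)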

andreyk, almost 8 years ago
"The point is that training deep nets carries a big cost, in both computational and debugging time. Such expense doesn't make sense for lots of day-to-day prediction problems and the ROI of tweaking a deep net to them, even when tweaking small networks, might be too low."

As a Masters student who has now been training deep models for a little while, I think this point is underemphasized. Doing something novel (so, not just image classification) requires a TON of engineering, not to mention the research considerations. And there are so many tiny decisions and hyperparameters that even when I thought I had considerable domain knowledge, I found it very lacking. I guess this should not be surprising, given that "Deep Learning" refers to a very broad set of models related only by having a learned hierarchical representation. There are a few problems where you can use existing deep learning almost off the shelf (most notably image classification and segmentation), but for most applications I think we're not there yet. As long as this remains true (which I suspect will be for a long time), SVMs, decision trees, and linear models are still definitely worth knowing and understanding.
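
For context on how little setup those classical baselines need, here is a minimal scikit-learn sketch; the dataset and model settings are illustrative choices, not anything prescribed in the thread:

    # Off-the-shelf classical baselines: SVM, decision tree, linear model.
    from sklearn.datasets import load_digits
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.linear_model import LogisticRegression

    X, y = load_digits(return_X_y=True)  # small stand-in dataset

    for name, clf in [("SVM", SVC()),
                      ("decision tree", DecisionTreeClassifier()),
                      ("logistic regression", LogisticRegression(max_iter=2000))]:
        scores = cross_val_score(clf, X, y, cv=5)  # 5-fold cross-validation
        print(f"{name}: mean accuracy {scores.mean():.3f}")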

digitalzombie, almost 8 years ago
If NNs were able to handle small data, would they be better than their counterparts?

I mean, if they worked well on small data, we would be seeing them dominate Kaggle in all problem domains. Maybe the small-data problems belong to other algorithms (such as tree-based methods, random forests, and SVMs).

Disclaimer: I'm biased toward tree-based algorithms on medium and small data, since they are my thesis topic.

fnl, almost 8 years ago
Pretty much agree, and *particularly* on the budget/time aspect.