TechEcho

7 comments

antognini almost 6 years ago

This is a pretty cool result, especially for those of us who are building models for mobile/embedded systems!

As a quick summary, the problem the authors are trying to solve is this: suppose that you have a convolutional neural network that performs some task and uses X amount of computation. Now you have access to, say, 2X computation. How should you change your model to best use the extra computation?

Generally people have taken advantage of the extra computation by widening the layers, using more layers (but of the same width), or increasing the resolution of the image. In this paper the authors show that if you do any of these individually, the performance of the NN saturates, but if you do all of them at the same time, you can achieve much higher accuracy.

Specifically, what they do is conduct a small grid search in the vicinity of the original NN, varying the width, depth, and resolution to figure out the best combination. Then they just use those scalings to scale up to the required compute. This seems to work well across a variety of different tasks.

The main gripe I had with the paper was that they didn't do another grid search around the scaled-up NN to verify that the scaling actually held. In practice it seems to produce pretty efficient NNs, but maybe the scaling doesn't extrapolate perfectly and you can get an even better NN by applying some correction.
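The grid-search-then-scale procedure summarized above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the `evaluate` callback, and the grid bounds are all hypothetical, and the FLOPs estimate assumes the usual approximation that convolutional cost grows linearly with depth and quadratically with width and resolution.

```python
import itertools


def flops_multiplier(depth, width, resolution):
    """Approximate FLOPs growth of a conv net: linear in depth,
    quadratic in width and in input resolution (an approximation)."""
    return depth * width ** 2 * resolution ** 2


def grid_search(evaluate, budget=2.0, tol=0.1, step=0.05, max_coef=2.0):
    """Return the (alpha, beta, gamma) with the best score among grid
    points whose FLOPs multiplier is within `tol` of `budget`.

    `evaluate(alpha, beta, gamma)` is a hypothetical callback that would
    train and validate the scaled network and return its accuracy.
    """
    n = int(round((max_coef - 1.0) / step)) + 1
    grid = [round(1.0 + i * step, 10) for i in range(n)]
    best, best_score = None, float("-inf")
    for alpha, beta, gamma in itertools.product(grid, repeat=3):
        # Skip combinations that miss the compute budget.
        if abs(flops_multiplier(alpha, beta, gamma) - budget) > tol:
            continue
        score = evaluate(alpha, beta, gamma)
        if score > best_score:
            best, best_score = (alpha, beta, gamma), score
    return best


def compound_scale(alpha, beta, gamma, phi):
    """Scale depth, width and resolution together with one exponent phi."""
    return alpha ** phi, beta ** phi, gamma ** phi
```

For example, `compound_scale(1.2, 1.1, 1.15, phi)` applies coefficients close to the ones the paper reports for its baseline network. Each increment of `phi` then multiplies the estimated FLOPs by roughly 2. Re-running `grid_search` around a scaled-up network would be one way to check whether the coefficients still hold at larger sizes.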


homarp almost 6 years ago

Associated Google AI blog: https://ai.googleblog.com/2019/05/efficientnet-improving-accuracy-and.html

and the source code: https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet

gwern almost 6 years ago

https://www.reddit.com/r/MachineLearning/comments/bumjdc/r_efficientnet_rethinking_model_scaling_for/

_coveredInBees almost 6 years ago

The results in this paper are quite astounding. Specifically, the improvement in efficiency and accuracy of their EfficientNet architecture over state-of-the-art feature extraction backbones is simply amazing (see Fig. 1 from the paper: https://raw.githubusercontent.com/tensorflow/tpu/master/models/official/efficientnet/g3doc/params.png). AFAICT, this is a huge leap in improvement, and what's fascinating is how fundamentally simple the entire premise of this paper and the type of experiments performed were.

Pretty much everyone will want to switch their feature extractors to some flavor of a pre-trained EfficientNet for any image classification / object detection type application going forward. I'm also excited to see the improvements in speed and accuracy that this can enable for mobile/embedded systems.

mkagenius almost 6 years ago

Can this now be combined with pruning the trained network to reduce the size even further, by 5x? [1]

[1] https://arxiv.org/abs/1611.06440 - Pruning Convolutional Neural Networks for Resource Efficient Inference

CShorten almost 6 years ago

Made a video which attempts to explain the idea of this paper! https://www.youtube.com/watch?v=3svIm5UC94I&t=6s

sdan almost 6 years ago

How is this different from hyperparameter tuning?


EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
