EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

119 points by asparagui almost 6 years ago

7 comments

antognini almost 6 years ago
This is a pretty cool result, especially for those of us who are building models for mobile/embedded systems!

As a quick summary, the problem the authors are trying to solve is this: suppose that you have a convolutional neural network that performs some task and uses X amount of computation. Now you have access to, say, 2X computation. How should you change your model to best use the extra computation?

Generally people have taken advantage of the extra computation by widening the layers, using more layers (but of the same width), or increasing the resolution of the image. In this paper the authors show that if you do any of these individually, the performance of the NN saturates, but if you do all of them at the same time, you can achieve much higher accuracy.

Specifically, what they do is conduct a small grid search in the vicinity of the original NN, varying the width, depth, and resolution to figure out the best combination. Then they just use those scalings to scale up to the required compute. This seems to work well across a variety of different tasks.

The main gripe I had with the paper was that they didn't do another grid search around the scaled-up NN to verify that the scaling actually held. In practice it seems to produce pretty efficient NNs, but maybe the scaling doesn't extrapolate perfectly and you can get an even better NN by applying some correction.
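
The procedure summarized above can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names, the grid step, the baseline sizes, and the `evaluate` callback are all assumptions made for the example. The constraint alpha * beta^2 * gamma^2 ≈ 2 and the values alpha = 1.2, beta = 1.1, gamma = 1.15 are what the paper reports for its baseline network; the constraint reflects that FLOPs grow roughly linearly with depth and quadratically with width and resolution.

```python
import math
from itertools import product


def flops_multiplier(alpha, beta, gamma):
    """Approximate factor by which FLOPs grow when depth, width and
    resolution are multiplied by alpha, beta and gamma respectively."""
    return alpha * beta ** 2 * gamma ** 2


def candidate_coefficients(step=0.05, max_coeff=1.5, target=2.0, tol=0.1):
    """Enumerate (alpha, beta, gamma) on a coarse grid and keep the triples
    whose FLOPs multiplier is within tol of target.
    The grid bounds and step are assumptions for this sketch."""
    n = int(round((max_coeff - 1.0) / step)) + 1
    grid = [round(1.0 + i * step, 2) for i in range(n)]
    return [c for c in product(grid, repeat=3)
            if abs(flops_multiplier(*c) - target) <= tol]


def pick_best(candidates, evaluate):
    """Return the candidate with the highest score. `evaluate` is a
    hypothetical callback that would train the slightly scaled network
    and return its validation accuracy."""
    return max(candidates, key=evaluate)


def scale_network(base_depth, base_width, base_resolution, coeffs, phi):
    """Apply the fixed coefficients with compound exponent phi.
    Each increment of phi roughly doubles the FLOPs budget."""
    alpha, beta, gamma = coeffs
    depth = math.ceil(base_depth * alpha ** phi)
    width = round(base_width * beta ** phi)
    resolution = round(base_resolution * gamma ** phi)
    return depth, width, resolution


if __name__ == "__main__":
    coeffs = (1.2, 1.1, 1.15)  # values reported in the paper
    # Baseline of 18 layers, 32 channels, 224 px is a made-up example.
    for phi in range(4):
        print(phi, scale_network(18, 32, 224, coeffs, phi))
```

Note that the coefficients are chosen once near the small baseline and then reused for every value of `phi`, which is the extrapolation step the gripe above is about: nothing in this scheme re-checks the grid around the scaled-up network.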
homarp almost 6 years ago
Associated Google AI blog: https://ai.googleblog.com/2019/05/efficientnet-improving-accuracy-and.html

and the source code: https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet
gwern almost 6 years ago
https://www.reddit.com/r/MachineLearning/comments/bumjdc/r_efficientnet_rethinking_model_scaling_for/
_coveredInBees almost 6 years ago
The results in this paper are quite astounding. Specifically, the improvement in efficiency and accuracy of their EfficientNet architecture over state-of-the-art feature extraction backbones is simply amazing (see Fig. 1 from the paper: https://raw.githubusercontent.com/tensorflow/tpu/master/models/official/efficientnet/g3doc/params.png). AFAICT, this is a huge leap in improvement, and what's fascinating is how fundamentally simple the entire premise of this paper and the type of experiments performed were.

Pretty much everyone will want to switch their feature extractors to some flavor of a pre-trained EfficientNet for any image classification / object detection type application going forward. I'm also excited to see the improvements in speed and accuracy that this can enable for mobile/embedded systems.
mkagenius almost 6 years ago
Can this now be combined with pruning the trained network to reduce the size by a further 5x? [1]

1. https://arxiv.org/abs/1611.06440 - Pruning Convolutional Neural Networks for Resource Efficient Inference
CShorten almost 6 years ago
Made a video which attempts to explain the idea of this paper! https://www.youtube.com/watch?v=3svIm5UC94I&t=6s
sdan almost 6 years ago
How is this different from hyperparameter tuning?