This is a pretty cool result, especially for those of us building models for mobile/embedded systems!

As a quick summary, the problem the authors are trying to solve is this: suppose you have a convolutional neural network that performs some task using X amount of computation. Now you have access to, say, 2X computation --- how should you change your model to best use the extra compute?

Generally people have taken advantage of extra computation by widening the layers, adding more layers (of the same width), or increasing the input image resolution. In this paper the authors show that if you scale any one of these individually, the accuracy of the NN saturates, but if you scale all of them at the same time, you can achieve much higher accuracy.

Specifically, they conduct a small grid search in the vicinity of the original NN, varying width, depth, and resolution to find the best combination at roughly fixed compute. Then they reuse those scaling coefficients to scale up to the required compute budget. This seems to work well across a variety of different tasks.

The main gripe I had with the paper is that they didn't run another grid search around the scaled-up NN to verify that the scaling actually held. In practice it seems to produce pretty efficient NNs, but maybe the scaling doesn't extrapolate perfectly and you could get an even better NN by applying some correction.
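To make the idea concrete, here's a minimal sketch of the two-step procedure. The helper names (`grid_search`, `scale_model`) and the specific grid values are my own illustration, not from the paper, and it assumes the usual rough FLOPs model where compute scales linearly with depth and quadratically with both width and resolution, so a candidate (alpha, beta, gamma) doubles compute when alpha * beta^2 * gamma^2 is about 2. In the paper each candidate would be scored by actually training and evaluating the scaled model; here that part is elided.

```python
# Hypothetical sketch of compound scaling, not the authors' code.
# Assumes FLOPs ~ depth * width^2 * resolution^2, so a scaling triple
# (alpha, beta, gamma) keeps compute at ~2x the base model when
# alpha * beta**2 * gamma**2 is approximately 2.
from itertools import product


def grid_search(step=0.05, tol=0.02):
    """Enumerate (alpha, beta, gamma) candidates near 1.0 whose combined
    scaling costs ~2x compute. In practice you'd train a model for each
    candidate and keep the most accurate one."""
    grid = [1.0 + step * i for i in range(1, 8)]  # 1.05 .. 1.35
    return [
        (a, b, g)
        for a, b, g in product(grid, grid, grid)
        if abs(a * b**2 * g**2 - 2.0) < tol
    ]


def scale_model(depth, width, resolution, alpha, beta, gamma, phi):
    """Reuse the chosen coefficients to scale the base model up by phi
    steps of compute (each step ~doubles FLOPs under the model above)."""
    return (
        round(depth * alpha**phi),
        round(width * beta**phi),
        round(resolution * gamma**phi),
    )
```

For example, with phi = 0 you get the base model back unchanged, and each increment of phi multiplies depth, width, and resolution by the same fixed coefficients rather than growing only one axis.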