This is a nice overview, but it needs some editing, as I guess English is not the author's first language (though it's still better written than I could manage in my second language). The meat of it is there, but I think it needs a once-over from either a native speaker or someone more fluent.<p>I think there's some content still to be added here:<p>> Then we remove some layers. then further remove somethings, but now the results are too pool. more explains, and the results table.<p>And a suggested rewording for this sentence:<p>> After compiled library for smart platform, the last thing is call C-API in the target language (Jave/Swift).<p>"After the library is compiled for the smart platform, the last thing to do is call the C API from the target language (Java/Swift)".<p>On the actual topic, I really like that it's now possible to take powerful recognition frameworks and put them onto "small" devices. Great work.
I saw a demo of a deep convnet (I assume) running on a tablet at a conference recently. While it's limited to the vocabulary that the net was trained on, it's seriously impressive seeing this stuff work in real time.<p>One of the other speakers was giving a tech demo using his webcam. He was looking around for a mug to demonstrate that the classification was good and could work quickly. In the meantime, the camera was pointed behind him on stage and correctly classified the image as "theatre curtains". It was particularly cool because image processing results are often cherry-picked to show optimal performance, so you learn to be skeptical.
Link to the actual 30k-line C++ file:
<a href="https://raw.githubusercontent.com/jdeng/gomxnet/master/mxnet.cc" rel="nofollow">https://raw.githubusercontent.com/jdeng/gomxnet/master/mxnet...</a><p>Compiles with "clang++ -lblas mxnet.cc -std=c++11", so I'm not disappointed :)