The cold hard reality of machine learning is that most useful data isn't readily available to just be collected. Semi-supervised and weakly supervised learning, data augmentation, multi-task learning — these are the techniques that will enable machine learning for the majority of companies out there, who need to build datasets and potentially leverage domain expertise to bootstrap intelligent features in their apps. This is great work in that direction for computer vision.<p>Even the giants are recognizing this fact and are leveraging it to great effect. Some keywords to search for good papers and projects: Overton, Snorkel, Snorkel MeTaL
Great summary! Reminds me a lot of Leon Bottou's work on using deep learning to learn causal invariant representations. (Video: <a href="https://www.youtube.com/watch?v=lbZNQt0Q5HA" rel="nofollow">https://www.youtube.com/watch?v=lbZNQt0Q5HA</a>)<p>We can view the augmentations of the image as "interventions" forcing the model to learn an invariant representation of the image.<p>Although the blog post did not frame it as this type of problem (not sure if the paper did), I think it can definitely be seen as such, and it is really promising.
I wish all papers were structured this way, by default.<p>That is, plenty of good diagrams, clear explanations and intuitions, no unnecessary mathiness.
I wonder if a two-step process could work better than this: first train a variational autoencoder (or even a plain autoencoder) on the unlabeled data, then use its learned representation to train a classifier on the labeled samples.<p>In (1) there is a full example of this two-step strategy, but it uses more labeled data to reach 92% accuracy. Could someone try changing the second part to use only ten labels for the classification step and share the results?<p>(1) <a href="https://www.datacamp.com/community/tutorials/autoencoder-classifier-python" rel="nofollow">https://www.datacamp.com/community/tutorials/autoencoder-cla...</a><p>Edited: I found a deeper analysis in (2). In short, for CIFAR-10 the VAE semi-supervised learning approach gives poor results — but the author did not use augmentation!<p>(2) <a href="http://bjlkeng.github.io/posts/semi-supervised-learning-with-variational-autoencoders/" rel="nofollow">http://bjlkeng.github.io/posts/semi-supervised-learning-with...</a>
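To make the two-step idea concrete, here is a minimal, dependency-light sketch. It swaps in PCA (via SVD) as a linear stand-in for the autoencoder's encoder, and uses a nearest-centroid classifier with only 10 labels; the synthetic two-cluster data is made up for illustration, not CIFAR-10:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for an unlabeled dataset: two Gaussian clusters in 50-D.
n_per_class, dim = 500, 50
centers = rng.normal(size=(2, dim)) * 3
unlabeled = np.vstack([c + rng.normal(size=(n_per_class, dim)) for c in centers])

# Step 1: unsupervised stage. PCA via SVD acts as a linear "encoder",
# trained on unlabeled data only, projecting to a 2-D latent space.
mean = unlabeled.mean(axis=0)
_, _, vt = np.linalg.svd(unlabeled - mean, full_matrices=False)
encode = lambda x: (np.atleast_2d(x) - mean) @ vt[:2].T

# Step 2: supervised stage with only 10 labels (5 per class):
# nearest-centroid classification in the latent space.
labeled_x = np.vstack([centers[0] + rng.normal(size=(5, dim)),
                       centers[1] + rng.normal(size=(5, dim))])
labeled_y = np.array([0] * 5 + [1] * 5)
latent = encode(labeled_x)
centroids = np.vstack([latent[labeled_y == k].mean(axis=0) for k in (0, 1)])

def predict(x):
    z = encode(x)
    return np.argmin(np.linalg.norm(z[:, None] - centroids[None], axis=2), axis=1)

test_x = np.vstack([c + rng.normal(size=(100, dim)) for c in centers])
test_y = np.repeat([0, 1], 100)
acc = (predict(test_x) == test_y).mean()
print(f"accuracy with 10 labels: {acc:.2f}")
```

A real run of the idea would replace the PCA step with a trained (V)AE encoder, but the structure — representation learned from unlabeled data, classifier fit on a handful of labels — is the same.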
I wish there were a way to augment data as easily for free text and other business data. I always see these few-shot learning papers for images — I suspect because it's easy to augment image datasets and because image recognition is interesting to laypeople. The vast majority of data we deal with in business is text/numerical, which is much harder to use with these approaches.
I don't know much about ML/deep learning and I have a burning question:<p>Say we have 10 images as a starting point. Then we create 10,000 images from those 10 by adding noise, applying filters, flipping them, skewing them, distorting them, etc. Isn't the underlying information the same (by some formal measure, e.g. Shannon entropy)? Would that actually improve a neural network?<p>I've always wondered: is it possible to generate infinite data and get near-perfect neural network accuracy?
I had a read through this and I couldn't really tell whether there was something novel here.<p>I understand that perturbations and generating new examples from labelled examples are a pretty normal part of the process when you only have a limited number of examples available.
It is not the same thing, but it kind of reminds me of my naive and obvious (meaning something that came up while drinking beer) idea of generating a bunch of variations of your labeled data when you don't have enough.<p>Let's say you only have one image of a dog: you generate a bunch of color variations, sharpness adjustments, flips, transforms, etc.
Voila you have 256 images of the same dog.<p>EDIT: I noticed that this is definitely a common idea as others have already pointed out.
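For what it's worth, here's a minimal numpy sketch of that beer-napkin idea — the `augment` helper and the random 8x8 "dog" image are made up for illustration:

```python
import numpy as np

def augment(image, brightness_factors=(0.8, 1.0, 1.2)):
    """Generate simple variants of one image:
    4 rotations x 2 flip states x len(brightness_factors) levels."""
    variants = []
    for k in range(4):                       # 0/90/180/270 degree rotations
        rotated = np.rot90(image, k)
        for flipped in (rotated, np.fliplr(rotated)):
            for f in brightness_factors:
                # Scale brightness, then clip back into valid pixel range.
                variants.append(np.clip(flipped * f, 0.0, 1.0))
    return variants

# A tiny fake 8x8 grayscale "dog" image (random pixels as a stand-in).
rng = np.random.default_rng(42)
dog = rng.random((8, 8))
augmented = augment(dog)
print(len(augmented))  # 4 * 2 * 3 = 24 variants from one image
```

Add a few noise levels and crops on top and you get to hundreds of variants of the same dog quickly.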
I am not sure how this article got ranked so high. I am suspicious of reading these articles written by non-experts. I would prefer to go to authentic sources and read the original paper. Most of the time, the information in these articles is misleading and wrong.
Title is (slightly) wrong.<p>As the first paragraph says:
"In this post, we will understand the concept of FixMatch and also see it got 78% accuracy on CIFAR-10 with just 10 images."<p>Reporting the <i>best</i> performance on a method that deliberately uses just a small subset of the data is shady as heck.