One of the most incredible parts is that they've already run feature extraction on all 100M images/videos and produced 50TB of:<p>"SIFT, GIST, Auto Color Correlogram, Gabor Features, CEDD, Color Layout, Edge Histogram, FCTH, Fuzzy Opponent Histogram, Joint Histogram, Kaldi Features, MFCC, SACC_Pitch, and Tonality"<p>The benefit for researchers is not only that this saves dozens of CPU-years of computation (back of the envelope, it would take my laptop 15 years to extract those SIFT features alone), but also that any differences in learning/recognition performance on the dataset can be attributed to the algorithms in question, uncomplicated by which researcher engineered the best features for the dataset. On the other hand, it's a challenging dataset to work with because, at this scale, you can't just download it and process it locally as has traditionally been done. I'll be interested to see how many take advantage of it.
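<p>For anyone who wants to redo the back-of-the-envelope number with their own hardware, here is a minimal sketch. It only uses the two figures above (100M items, 15 years); the function names, the 365.25-day year, and the assumption of one machine processing items one at a time are mine, not anything published with the dataset.

```python
# Back-of-the-envelope check of the "15 years on my laptop" estimate.
# Assumptions (mine): a 365.25-day year and strictly sequential processing.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def implied_seconds_per_item(num_items, years):
    """Per-item extraction time implied by a total runtime in years."""
    return years * SECONDS_PER_YEAR / num_items


def years_for(num_items, seconds_per_item, workers=1):
    """Total wall-clock years, assuming perfect scaling across workers."""
    return num_items * seconds_per_item / workers / SECONDS_PER_YEAR


per_item = implied_seconds_per_item(100_000_000, 15)
print(f"implied cost: {per_item:.2f} s per item")
print(f"with 100 workers: {years_for(100_000_000, per_item, workers=100):.2f} years")
```

So the 15-year figure corresponds to a bit under 5 seconds per item. Plug in your own measured per-item time to see how far off your machine would be.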