First, my earlier comments on a "competing" approach from Facebook may help give relevant context for how to think about these numbers: <a href="https://news.ycombinator.com/item?id=7393378" rel="nofollow">https://news.ycombinator.com/item?id=7393378</a><p>Briefly skimming through this paper, it appears that these numbers are not a fair comparison, as this paper uses the <i>un</i>restricted protocol of LFW[1], whereas the other methods in the ROC curve shown in the paper are using the restricted protocol. As you might imagine, the latter is more restrictive -- specifically in terms of the amount of training data allowed. And as I mentioned in my previous comment, training data is king in these kinds of systems -- more is always better.<p>To go slightly out on a limb, I think more significant than the new theoretical model proposed in this paper is probably the use of lots of different types of datasets for training. (Significantly more data >> more complicated models, most of the time.) But I'd have to read the paper much more carefully to be sure about this.<p>[1] <a href="http://vis-www.cs.umass.edu/lfw/results.html" rel="nofollow">http://vis-www.cs.umass.edu/lfw/results.html</a>
The original url [1] was blogspam—that is, it was a knock-off (or excerpt) of some other, more original source. In such cases HN strongly prefers the original source.<p>Submitters: blogspam is usually easy to recognize. Please check for that and post the original instead.<p>1. <a href="http://news.sciencemag.org/signal-noise/2014/04/face-recognition-algorithm-finally-beats-humans" rel="nofollow">http://news.sciencemag.org/signal-noise/2014/04/face-recogni...</a>
The actual paper (parts are accessible & interesting): <a href="http://arxiv.org/pdf/1404.3840v1.pdf" rel="nofollow">http://arxiv.org/pdf/1404.3840v1.pdf</a>
Face recognition is one of those technologies that seems neat at a glance and mindbogglingly terrifying on closer inspection. It has the potential to sci-fi the world overnight and it could do it tomorrow night. The algorithm accuracy and enormous comparison DBs are already here.<p>The effect this can have on commerce, advertising, policing, crime, culture, or a bunch of other things is wide-reaching enough for a sci-fi thriller.<p>A camera in cahoots with a till in a supermarket could put a face and a name on every purchase. If the camera and the till are in cahoots with an advertising billboard in a shopping mall, you have created an offline version of conversion tracking.<p>Since the supermarket and billboard company are in cahoots, they can compare notes and find a billboard location that gets the supermarket's best customers. If you are seen checking out climbing gear by a camera in cahoots with Facebook, that store can keep showing outdoor activity products to you on Facebook. Hello offline retargeting.<p>That's just advertising. Imagine policing. Imagine high school.
I'm a computer vision grad student. A few things concern me about this work. Maybe they're incidental, but I'm not ready to throw my hands up in the air quite yet.<p>- Why wasn't this accepted to CVPR/ECCV/one of the well-established computer vision conferences? I would love to read some of the reviewers' comments about this work before I give further judgment. (If this really is some CVPR preprint, or if it actually is peer-reviewed, I'd feel much better about this.)<p>- Why isn't this work listed on the official curated "LFW Results" page that Erik Learned-Miller maintains? <a href="http://vis-www.cs.umass.edu/lfw/results.html" rel="nofollow">http://vis-www.cs.umass.edu/lfw/results.html</a> Is this work so new that Erik hasn't had time to review it yet?<p>- Human performance on LFW is 99.2%, which is higher than what the authors think it is. The performance drops to the (claimed) 97% when we only show humans a tight crop of the face: <a href="http://www1.cs.columbia.edu/CAVE/publications/pdfs/Kumar_ICCV09.pdf" rel="nofollow">http://www1.cs.columbia.edu/CAVE/publications/pdfs/Kumar_ICC...</a> They discuss this difference in a paragraph in their conclusion, but I consider it dishonest to use the lower number in the abstract and imply it in the title. In fact, I consider it misleading to put "Surpassing human performance" in the title to begin with, but that's another matter :)<p>- Showing good performance on one dataset (LFW) is certainly not enough to show that this "outperforms humans" in the general case. Getting a state-of-the-art result on LFW these days is like squeezing a drop of water out of a rock; in my opinion, we should turn our attention to harder datasets like GBU now that these "easier" ones are solved.<p>I'm not terribly familiar with Gaussian processes so I'm not sure whether the math works out, but it is a pretty uncommon thing to try in this domain. 
(Perhaps that's what makes this work interesting, especially since this year seems to be the "Deep Learning is Eating Everyone's Lunch" year.)<p>I also wish they had described what final-stage classifier they use for the "GaussianFace as Feature Extractor" model. Often, that's the most important step; it's strange that they didn't compare with POOF/High-dimensional-LBP/Face++'s deep-learned features/any of the other state-of-the-art feature extractors, especially considering how much worse "GaussianFace as a binary classifier" does (93% vs 97% is a huge difference on this dataset).<p>Just my two cents. It definitely demands further exploration. I don't see any obvious mistakes, but I'm not sure why their approach works as well as they claim it does either.<p><i>Edit</i>: I don't mean to start a witch hunt or anything, but if the authors have the guts to put "Human-level performance" in their title, they're just <i>begging</i> for the community to inspect every detail and point out every flaw, down to the last minutia, in their work. It's our community's hot button. It's similar to the old adage about how if you want a Linux user to help you, you have to tell them how much Linux sucks. That's where much of my skepticism comes from. The most astounding papers are often the most humble, but "humble" certainly doesn't describe this work.
I'll have to run this by my friend who writes morphometrics algorithms, as I can't actually tell what is new about this paper. This might actually allow for a proper photo-matching search engine. All the ones that I have tried up to this point have been lacking or broken...