This reminds me of something my visual psychology professor was attempting 15 years ago to help those with poor vision, though it didn't appear to get anywhere at the time.<p>The idea was a simple (but clever) one - use virtual reality to segment the world into solid blocks of identified objects. The solid blocks are identifiable to those with poor vision in a way that the real world is not.<p>Essentially this meant processing an image, identifying items (e.g. cars, fences, roads) and then colouring them solid. So instead of a confusing scene of blur, you have a blurred but still identifiable scene of a solid strip of grey for the road, a solid blob of red for the car, another solid yellow strip for a fence etc. A poorly sighted person could still identify from this something that made sense in a way that they couldn't in the real world.<p>What was required was an input, real-time visual processing and then display back to the user - all of which was fantasy 15 years ago.<p>However, attempt this today with a visual feed, real-time processing like this, and then near instantaneous display of the results back to the person with e.g. Google Glass, and you might have a viable way to show the world categorised in a visual way that will help those with poor vision. Interesting times.
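To make the "colour them solid" step concrete, here is a minimal sketch. It assumes some segmentation model (not shown) has already produced a per-pixel map of class IDs; the IDs, the palette and the function name `colour_blocks` are all hypothetical.

```python
# Hypothetical class IDs and colours. A real system would get the label map
# from a segmentation model, which is not shown here.
PALETTE = {
    0: (0, 0, 0),        # unknown -> black
    1: (128, 128, 128),  # road    -> grey
    2: (255, 0, 0),      # car     -> red
    3: (255, 255, 0),    # fence   -> yellow
}


def colour_blocks(label_map, palette=PALETTE):
    """Replace every per-pixel class ID with that class's solid RGB colour."""
    unknown = palette[0]
    return [[palette.get(label, unknown) for label in row] for row in label_map]


# A tiny 3x4 "scene": a fence on top, a car sitting on the road below it.
labels = [
    [3, 3, 3, 3],
    [1, 2, 2, 1],
    [1, 1, 1, 1],
]
blocks = colour_blocks(labels)
```

The hard part is of course producing `labels` in real time, not the recolouring itself.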
Looks like this is using training data from the PASCAL VOC object detection
challenge [1], which is the standard benchmark for evaluating object detection
performance in computer vision.<p>Object detection is an extremely tough problem (some would say it is <i>the</i> computer vision
problem ;-)), and while we've made a lot of progress in the past decade, the best
methods are still terrible [2] -- average detection precision between 30-50%.
For reference, most consumer applications require an AP of 90+% to be considered
usable.<p>So if this is a completely automated solution, it's not going to be able to do
much better, unless the creators can make <i>massive</i> (I mean orders-of-magnitude)
improvements on the state-of-the-art.<p>But that being said, there are some applications where lower performance is
acceptable. And if you add some manual verification, you could conceivably
make this much better (with an increase in latency, though). Another possibility
is to specialize on a certain type of input image (e.g., if you're a company
taking photos in your warehouse, where all your photos look very similar and/or
you can control the lighting and environment).<p>Still, I'm excited to see companies attempting to take object detection out to
the real world. All the best to these guys!<p>[1] <a href="http://pascallin.ecs.soton.ac.uk/challenges/VOC/" rel="nofollow">http://pascallin.ecs.soton.ac.uk/challenges/VOC/</a><p>[2] <a href="http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2011/results/index.html" rel="nofollow">http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2011/resu...</a>
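For anyone unfamiliar with the AP numbers quoted above, here is a rough sketch of how average precision can be computed from a ranked list of detections. It is a simplified, non-interpolated variant and not the exact PASCAL VOC evaluation protocol; the function name and the toy data are made up for illustration.

```python
def average_precision(detections, num_ground_truth):
    """Non-interpolated average precision.

    detections: list of (confidence, is_true_positive) pairs.
    num_ground_truth: number of real objects that should have been found.
    """
    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
            # Precision measured at the rank of each correct detection.
            precision_sum += true_positives / rank
    return precision_sum / num_ground_truth if num_ground_truth else 0.0


# Toy data: 4 real objects, 4 detections, only 2 of them correct.
dets = [(0.9, True), (0.8, False), (0.7, True), (0.6, False)]
ap = average_precision(dets, num_ground_truth=4)
```

On this toy data `ap` comes out at roughly 0.42: missed objects and false alarms both drag the score down.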
This is not Pedro Felzenszwalb's discriminative part-model algorithm. This is simple AdaBoost. The authors have labeled a bunch of datasets (1000s of them) and are able to detect whatever object.
AdaBoost (Viola/Jones) is the most popular yes/no detector, and there is an OpenCV API for it. It is used for detecting faces and license plates in commercial applications.
The full person detector is nothing but an SVM+HOG descriptor.<p>As a computer vision researcher, I am not impressed by this. It is primarily an API for smartphone app makers who want a binary result for detection. It does not help with scene context analysis. For instance, if I have a big picture of an airplane on a wall, it will detect the airplane. Does it know whether this airplane is in the sky or on a wall?
There are a thousand failure cases.
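For readers who haven't met AdaBoost, here is a toy sketch of the boosting idea: several weak threshold classifiers ("stumps") are combined by weighted vote into a stronger yes/no classifier. This is not the service's code and not Viola/Jones proper (no Haar features, no cascade); it works on made-up 1-D data and all names are hypothetical.

```python
import math


def train_adaboost(xs, ys, rounds=3):
    """Discrete AdaBoost with threshold stumps on 1-D features.

    xs: list of floats, ys: list of +1/-1 labels.
    Returns a list of (alpha, threshold, polarity) weak classifiers.
    """
    n = len(xs)
    weights = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        # Pick the stump with the lowest weighted error.
        best = None
        for threshold in sorted(set(xs)):
            for polarity in (1, -1):
                preds = [polarity if x >= threshold else -polarity for x in xs]
                err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
                if best is None or err < best[0]:
                    best = (err, threshold, polarity, preds)
        err, threshold, polarity, preds = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, threshold, polarity))
        # Up-weight the samples this stump got wrong, then renormalise.
        weights = [w * math.exp(-alpha * y * p)
                   for w, y, p in zip(weights, ys, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


def predict(model, x):
    """Weighted vote of all stumps; returns +1 or -1."""
    score = sum(alpha * (pol if x >= thr else -pol) for alpha, thr, pol in model)
    return 1 if score >= 0 else -1


# No single threshold separates these labels, but three stumps together do.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
model = train_adaboost(xs, ys, rounds=3)
predictions = [predict(model, x) for x in xs]
```

A single stump cannot get this data right, which is the point: the combined vote can. It also shows why the output is a bare yes/no with no notion of scene context.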
Failed completely for me across a half dozen tries. I wonder how cheaply you could get results via Mechanical Turk. I bet you could get much more accurate results for a very low price but with some added latency.
Hey everybody, OP here. Thanks for the great feedback! We're really happy that so many people have checked this out.<p>One thing that I want to mention: our service was built favoring Precision over Recall; we reasoned that we'd rather have a low number of false positives and make sure that when we do report a detection, that it actually is one. Thus, our service may occasionally miss instances.<p>I'm going to implement a button on the Experiment page that lets you flag a detection as something that we need to work on; we will use your feedback to improve the accuracy.
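The precision-over-recall trade-off described above can be illustrated with a small sketch. The detections and the thresholds are invented for the example, and this says nothing about how the service actually sets its operating point.

```python
def precision_recall(detections, num_ground_truth, threshold):
    """Precision and recall when only detections >= threshold are reported.

    detections: list of (confidence, is_true_positive) pairs.
    """
    kept = [is_tp for conf, is_tp in detections if conf >= threshold]
    true_positives = sum(kept)
    # By convention, reporting nothing counts as perfect precision.
    precision = true_positives / len(kept) if kept else 1.0
    recall = true_positives / num_ground_truth if num_ground_truth else 0.0
    return precision, recall


# Toy data: 5 real objects in the image, 6 candidate detections.
dets = [(0.95, True), (0.90, True), (0.60, True),
        (0.55, False), (0.40, True), (0.30, False)]
strict = precision_recall(dets, num_ground_truth=5, threshold=0.8)
loose = precision_recall(dets, num_ground_truth=5, threshold=0.3)
```

With the strict threshold every reported detection is correct but only 2 of the 5 objects are found; the loose threshold finds 4 of 5 at the price of two false positives.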
<a href="http://www.dauntless-soft.com/products/freebies/airbus380/a380_5.jpg" rel="nofollow">http://www.dauntless-soft.com/products/freebies/airbus380/a3...</a> detected 0 planes; there should be at least 5.<p>When I used <a href="http://www.airbus.com/fileadmin/media_gallery/aircraft_pages_photo_galleries/a380-gallery/A380_On_Ground.JPG" rel="nofollow">http://www.airbus.com/fileadmin/media_gallery/aircraft_pages...</a> it detected 2 planes, but there is only 1.<p>I hope it will improve with additional training images, though.
As a long-time CV enthusiast, I applaud the tech and the way you guys make it "just work". However, for any serious application I feel a few things are missing:<p>- your pricing won't work for video (even at only 5fps)<p>- I can't really use the data without a confidence level for each detection, because for some applications I'd rather discard a bounding box that is below a threshold I set.<p>Other than that, congrats on the great work :)
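Both points can be sketched in a few lines. Everything here is hypothetical: the `Detection` shape, the `confidence` field (which, per the comment above, the API does not expose) and the price of 1 cent per call, which is only an assumption based on the "one or two cents" figure mentioned elsewhere in this thread.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str
    box: tuple         # (x, y, width, height) in pixels
    confidence: float  # hypothetical field, 0.0 to 1.0


def filter_by_confidence(detections, min_confidence):
    """Keep only detections at or above the caller's own threshold."""
    return [d for d in detections if d.confidence >= min_confidence]


def video_cost_cents_per_hour(fps, cents_per_call):
    """Cost of one API call per frame for one hour of video."""
    return fps * 3600 * cents_per_call


raw = [
    Detection("airplane", (10, 20, 200, 80), 0.92),
    Detection("airplane", (300, 40, 60, 25), 0.35),
]
kept = filter_by_confidence(raw, min_confidence=0.5)
hourly_cents = video_cost_cents_per_hour(fps=5, cents_per_call=1)
```

At the assumed 1 cent per call, 5 fps comes to 18,000 cents, i.e. $180 per hour of video.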
Over a dozen experiments, the recognition rate for faces seems to be about 70%. Example of failure: only 2 faces detected here (in particular NOT the one in focus) <a href="http://iamdaveknockles.files.wordpress.com/2011/03/meeting_jpg.jpg" rel="nofollow">http://iamdaveknockles.files.wordpress.com/2011/03/meeting_j...</a><p>This is worse than OpenCV (I thought you were using OpenCV but apparently aren't?)
Tried detecting Airplanes on this image with 18 airplanes, but it only detected 4 of them.<p><a href="https://lh3.ggpht.com/-GbPgbhUtmnE/UH0p3VmMWoI/AAAAAAAAApM/uuP6VHzyZ44/s1600/all-planes_800.jpg" rel="nofollow">https://lh3.ggpht.com/-GbPgbhUtmnE/UH0p3VmMWoI/AAAAAAAAApM/u...</a>
Tried to detect Cats in a whole room full of cats, and it detected zero cats.<p><a href="http://englishrussia.com/wp-content/uploads/2007/08/130-cats-1.jpg" rel="nofollow">http://englishrussia.com/wp-content/uploads/2007/08/130-cats...</a>
Wow, I just tried this image of faces:<p><a href="http://3rdarm.biz/images/2010/02/faces.jpg" rel="nofollow">http://3rdarm.biz/images/2010/02/faces.jpg</a><p>It got almost all of them, but with many errors. It can't detect sheep either.<p>I was really impressed at first, but as I tried more and more images, it became apparent that the API isn't mature enough to be worth one or two cents. There is roughly a 90% chance that the algorithm detects the image correctly, but sometimes it doesn't detect the entire object. For example, I used another image of two jets, and it found only one of them even though the jets were identical apart from one being smaller than the other.
Concerning the API: (On page <a href="https://www.dextrorobotics.com/api" rel="nofollow">https://www.dextrorobotics.com/api</a>)<p>* The documentation is pretty weak.<p>* I am not sure what a classID is, and I don't see any links to where the numbers come from.<p>* The example request is posting to an insecure HTTP address, but the secret API key is required?<p>* The example request doesn't fit on one line? It took me a while to see it was in the "GET / HTTP/1.1" style.<p>* How do errors work? Having clearly specified error responses would be really useful.<p>If you're trying to sell me on your API, show it to me.
Did a few tests and it works pretty well! No false positives at least.<p>Any plans to increase the number of objects you can search for at once? Very interested in using this but I'd want to be able to scan for ~20 objects.
Interesting technology. It got a couple correct for me. But failed on a bunch as well. Here's a few horses it failed to find correctly.<p>2 horses / detected 0: <a href="http://images4.fanpop.com/image/photos/23500000/horse-horses-23582505-1024-768.jpg" rel="nofollow">http://images4.fanpop.com/image/photos/23500000/horse-horses...</a><p>4 horses / detected all as 1: <a href="http://4.bp.blogspot.com/-Rso9vw4BmSE/TqZU6vHl3kI/AAAAAAAACLk/NBBmDDLC9uY/s1600/slaughter%2Bof%2Bhorses%2Be%2BOctober%2B25%2B2011%2B3.jpg" rel="nofollow">http://4.bp.blogspot.com/-Rso9vw4BmSE/TqZU6vHl3kI/AAAAAAAACL...</a>
It didn't recognise the cat in this picture (<a href="http://i.imgur.com/TgFaJh.jpg" rel="nofollow">http://i.imgur.com/TgFaJh.jpg</a>) so I'm doubtful of its practicality.
Very interesting application, but I can't see a real-life use for it via a web API. As far as I know, this kind of thing is meant for real-time applications, and a web-based approach might not serve that purpose.<p>BTW, it can find only two airplanes in this photo
<a href="http://www.q8.com/SiteCollectionImages/Gatwick%20Airport.jpg" rel="nofollow">http://www.q8.com/SiteCollectionImages/Gatwick%20Airport.jpg</a>
I have been tinkering with a similar side project which you can read about here:<p><a href="http://artificial-intelligence-projects.com/augmented-reality/" rel="nofollow">http://artificial-intelligence-projects.com/augmented-realit...</a><p>It's still in the development stage because I can only fiddle with it when I have the time and impetus to do so. Criticisms/comments welcome.
Works well! Found a few it didn't work on. For example, it didn't detect an airplane in this image (but it's a fighter jet, so maybe not part of the training set): <a href="http://cdn-www.airliners.net/aviation-photos/photos/2/8/0/2043082.jpg" rel="nofollow">http://cdn-www.airliners.net/aviation-photos/photos/2/8/0/20...</a>
Hmm.. It's a fantastic idea and a really great website, but the actual algorithm is very imprecise.<p>See this:
<a href="http://i.imgur.com/ulith.png?1" rel="nofollow">http://i.imgur.com/ulith.png?1</a><p>You need to get a higher percentage of actual matches before you can use this for anything.
Didn't work for me. That said, image recognition via an API will be huge once things mature a little more.<p>I've been searching lately for a post-face.com API and have been following a few for a while, but they seem to have similar issues with poor results.
It'd be great if you could use this to detect nudity. Any plans for that? I'm assuming the balls on the "in the works" list are of the sport variety? ;)<p>In the works:
Shoes
Balls
Smartphones and tablets
Dogs
Keyboards
Cups and glasses
Doors
Keys
Shameless plug: libccv supports a RESTful API as of version 0.4, and it is open-source and free: <a href="http://libccv.org/doc/doc-http/" rel="nofollow">http://libccv.org/doc/doc-http/</a>. Trained pedestrian / car / face detectors are included.
<a href="http://rumors.automobilemag.com/files/2012/11/2013-VW-Eos-front-three-quarter-with-plane.jpg" rel="nofollow">http://rumors.automobilemag.com/files/2012/11/2013-VW-Eos-fr...</a><p>detected 3 planes... there is only 1 plane and a car
Failed with this :(<p><a href="http://i.dailymail.co.uk/i/pix/2012/11/06/article-2228752-15E0DC91000005DC-176_634x286.jpg" rel="nofollow">http://i.dailymail.co.uk/i/pix/2012/11/06/article-2228752-15...</a>
It didn't detect the face here:
<a href="http://i.imgur.com/c34dX.jpg" rel="nofollow">http://i.imgur.com/c34dX.jpg</a>
The algorithm probably got distracted and raised an exception.
Is there a good way for submitting recommendations for improvements? <a href="http://i.imgur.com/yLSHW.jpg" rel="nofollow">http://i.imgur.com/yLSHW.jpg</a>
nice!<p>does a good job with painting too, but it did find the phantom neighbor peeping in as well:<p><a href="http://img822.imageshack.us/img822/347/screenshot20130111at524.png" rel="nofollow">http://img822.imageshack.us/img822/347/screenshot20130111at5...</a>
detected 2 planes, there are 7 <a href="http://iskin.co.uk/wallpapers/imagecache/1280x800/jet_plane_formation.jpg" rel="nofollow">http://iskin.co.uk/wallpapers/imagecache/1280x800/jet_plane_...</a><p>seems too buggy to pay just yet
seems pretty good, but my first test found a potted plant in the aeroplane demo picture -- a 100 story potted plant :P very cool idea, super hard problem so mad respect regardless!