Show HN: API for detecting people, cars, and everyday objects in images

179 points by jluan over 12 years ago

40 comments

simonsquiff over 12 years ago

This reminds me of how my visual psychology professor was attempting to help those with poor vision 15 years ago, but didn't appear to get anywhere with it at the time.

The idea was a simple (but clever) one: use virtual reality to segment the world into solid blocks of identified objects. The solid blocks are identifiable to those with poor vision in a way that the real world is not.

Essentially this meant processing an image, identifying items (e.g. cars, fences, roads) and then colouring them solid. So instead of a confusing scene of blur, you have a blurred but still identifiable scene of a solid strip of grey for the road, a solid blob of red for the car, another solid yellow strip for a fence, etc. A poorly sighted person could still identify from this something that made sense in a way that they couldn't in the real world.

What was required was an input, real-time visual processing, and then display back to the user, all of which was fantasy 15 years ago.

However, attempt this today with a visual feed, real-time processing like this, and then near-instantaneous display of the results back to the person with e.g. Google Glass, and you might have a viable way to show the world categorised in a visual way that will help those with poor vision. Interesting times.
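A minimal sketch of the "colour each identified object solid" step described above, assuming some upstream detector or segmenter has already produced a per-pixel class label. The class ids, the palette, and the function name are hypothetical; the comment does not specify any of them.

```python
# Hypothetical class ids and colours; the comment names no specific palette.
PALETTE = {
    0: (128, 128, 128),  # road  -> solid grey
    1: (255, 0, 0),      # car   -> solid red
    2: (255, 255, 0),    # fence -> solid yellow
}
UNKNOWN = (0, 0, 0)      # anything unrecognised -> black


def flatten_to_blocks(label_map):
    """Replace every pixel's class id with the solid colour for that class."""
    return [[PALETTE.get(label, UNKNOWN) for label in row] for row in label_map]


# A tiny 3x4 "scene": a fence strip on top, a car sitting on the road below.
labels = [
    [2, 2, 2, 2],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
blocks = flatten_to_blocks(labels)
```

The hard part is of course producing `labels` in real time; the recolouring itself is a simple lookup.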

apu over 12 years ago

Looks like this is using training data from the PASCAL VOC object detection challenge [1], which is the standard benchmark for evaluating object detection performance in computer vision.

Object detection is an extremely tough problem (some would say it is the computer vision problem ;-)), and while we've made a lot of progress in the past decade, the best methods are still terrible [2] -- average detection precision between 30-50%. For reference, most consumer applications require an AP of 90+% to be considered usable.

So if this is a completely automated solution, it's not going to be able to do much better, unless the creators can make massive (I mean orders-of-magnitude) improvements on the state of the art.

But that being said, there are some applications where lower performance is acceptable. And if you add some manual verification, you could conceivably make this much better (with an increase in latency, though). Another possibility is to specialize on a certain type of input image (e.g., if you're a company taking photos in your warehouse, where all your photos look very similar and/or you can control the lighting and environment).

Still, I'm excited to see companies attempting to take object detection out to the real world. All the best to these guys!

[1] http://pascallin.ecs.soton.ac.uk/challenges/VOC/

[2] http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2011/results/index.html
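For readers unfamiliar with the AP figure quoted above, here is a minimal sketch of how average precision can be computed from a ranked list of detections. It uses the plain non-interpolated form (the precision at each true positive, summed and divided by the number of ground-truth objects), which is a simplification and not necessarily the exact protocol of the VOC evaluation. The detection list and function name are made up for illustration.

```python
def average_precision(scored_hits, num_ground_truth):
    """scored_hits: list of (confidence, is_true_positive) pairs.

    Detections are ranked by confidence; each true positive contributes the
    precision measured at its rank. Missed objects contribute nothing, which
    is why low recall drags AP down.
    """
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
            precision_sum += true_positives / rank
    return precision_sum / num_ground_truth if num_ground_truth else 0.0


# Illustrative data: 4 real objects, 4 detections, 2 of them correct.
detections = [(0.9, True), (0.8, False), (0.7, True), (0.6, False)]
ap = average_precision(detections, num_ground_truth=4)
```

With this toy input the result is (1/1 + 2/3) / 4, roughly 0.42, i.e. inside the 30-50% band mentioned above even though half the detections are right.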

ank286 over 12 years ago

This is not Pedro Felzenszwalb's discriminative part-model algorithm. This is simple AdaBoost. The authors have labeled a bunch of datasets (1000s of them) and are able to detect whatever object. AdaBoost (Viola/Jones) is the most popular yes/no detector, and there is an OpenCV API for it. It is used for detecting faces and license plates in commercial applications. A full-person detector is nothing but an SVM+HOG descriptor.

As a computer vision researcher, I am not impressed by this. It is primarily an API for smartphone app makers who want a binary result for detection. It does not help with scene context analysis. For instance, if I have a big picture of an airplane on a wall, it will detect the airplane. Does it know whether this airplane is in the sky or on a wall? There are a thousand failure cases.
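To make the AdaBoost reference concrete, here is a toy sketch of the boosting loop on one-dimensional data with threshold "stumps" as weak classifiers. This is only the generic algorithm: it is not the Viola/Jones detector (which boosts over Haar-like image features and arranges the result in a cascade), and nothing here is confirmed to be what the service actually runs. All names and data are illustrative.

```python
import math


def train_adaboost(xs, ys, rounds):
    """xs: list of floats, ys: list of +1/-1 labels.

    Returns a list of weak classifiers as (alpha, threshold, polarity).
    A stump predicts `polarity` when x >= threshold, else `-polarity`.
    """
    n = len(xs)
    weights = [1.0 / n] * n
    thresholds = sorted(set(xs))
    ensemble = []
    for _ in range(rounds):
        best = None
        # Pick the stump with the lowest weighted error on the current weights.
        for t in thresholds:
            for polarity in (1, -1):
                preds = [polarity if x >= t else -polarity for x in xs]
                err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
                if best is None or err < best[0]:
                    best = (err, t, polarity, preds)
        err, t, polarity, preds = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, t, polarity))
        # Raise the weight of misclassified samples, lower the rest, renormalise.
        weights = [w * math.exp(-alpha * y * p) for w, y, p in zip(weights, ys, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * (pol if x >= t else -pol) for alpha, t, pol in ensemble)
    return 1 if score >= 0 else -1


# No single threshold separates this pattern, but three boosted stumps do.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1, 1, -1, -1, 1, 1]
one_round = train_adaboost(xs, ys, rounds=1)
three_rounds = train_adaboost(xs, ys, rounds=3)
```

The point of the example is the reweighting step: each round concentrates on the samples the previous stumps got wrong, so a vote of weak yes/no rules ends up fitting a pattern none of them can fit alone.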

forrestthewoods over 12 years ago

Failed completely for me across a half dozen tries. I wonder how cheaply you could get results via Mechanical Turk. I bet you could get much more accurate results for a very low price but with some added latency.

jluan over 12 years ago

Hey everybody, OP here. Thanks for the great feedback! We're really happy that so many people have checked this out.

One thing that I want to mention: our service was built favoring precision over recall; we reasoned that we'd rather have a low number of false positives and make sure that when we do report a detection, it actually is one. Thus, our service may occasionally miss instances.

I'm going to implement a button on the Experiment page that lets you flag a detection as something that we need to work on; we will use your feedback to improve the accuracy.
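The precision/recall trade-off described above can be illustrated with a small sketch: raising the confidence threshold at which detections are reported tends to remove false positives (higher precision) at the cost of missing real objects (lower recall). The scores, threshold values, and function name are invented for illustration and say nothing about the service's actual internals.

```python
def precision_recall(scores_and_truth, threshold, num_ground_truth):
    """scores_and_truth: list of (score, is_real_object) candidate detections.

    Only candidates scoring at least `threshold` are reported.
    Convention used here: precision is 1.0 when nothing is reported.
    """
    reported = [is_real for score, is_real in scores_and_truth if score >= threshold]
    true_positives = sum(reported)
    precision = true_positives / len(reported) if reported else 1.0
    recall = true_positives / num_ground_truth if num_ground_truth else 0.0
    return precision, recall


# Six candidate detections for an image containing 4 real objects.
candidates = [
    (0.95, True), (0.90, True), (0.70, False),
    (0.60, True), (0.40, False), (0.30, True),
]
strict = precision_recall(candidates, threshold=0.85, num_ground_truth=4)
lenient = precision_recall(candidates, threshold=0.25, num_ground_truth=4)
```

Here the strict setting reports two detections, both correct (precision 1.0, recall 0.5), while the lenient one finds all four objects but also reports two false positives (precision about 0.67, recall 1.0). Several of the "it missed N planes" reports in this thread are what the strict end of that curve looks like.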

bluishgreen over 12 years ago

Can you give me a bit more technical background? Tell me how this is better than, e.g., out-of-the-box OpenCV filters.

senthilnayagam over 12 years ago

http://www.dauntless-soft.com/products/freebies/airbus380/a380_5.jpg detected 0 planes; there should be at least 5.

When I used http://www.airbus.com/fileadmin/media_gallery/aircraft_pages_photo_galleries/a380-gallery/A380_On_Ground.JPG it detected 2 planes; there was only 1.

But I hope that with additional training images it will improve.

steeve over 12 years ago

As a long-time CV enthusiast, I applaud the tech and the way you guys make it "just work". However, for any serious application I feel a few things are missing:

- your pricing won't work for video (even at only 5fps)
- I can't really use the data without a confidence level of detection, because for some applications I'd rather discard a bounding box that is below a threshold I set.

Other than that, congrats for the great work :)
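Both points above can be sketched in a few lines. The response shape (a list of dicts with a `confidence` field) is hypothetical, since the complaint is precisely that the API does not expose one, and the per-image price is a placeholder, not the service's actual pricing.

```python
def keep_confident(detections, min_confidence):
    """Drop bounding boxes whose confidence is below the caller's threshold."""
    return [d for d in detections if d["confidence"] >= min_confidence]


def hourly_cost_cents(frames_per_second, price_cents_per_image):
    """Cost of one hour of video if every frame is sent as a separate API call."""
    return frames_per_second * 3600 * price_cents_per_image


# Hypothetical detections; field names are assumptions for illustration only.
detections = [
    {"label": "car", "box": (10, 20, 110, 90), "confidence": 0.92},
    {"label": "car", "box": (200, 40, 260, 80), "confidence": 0.35},
]
confident = keep_confident(detections, min_confidence=0.5)

# Placeholder price of 1 cent per image, at the 5 fps mentioned above.
cost = hourly_cost_cents(frames_per_second=5, price_cents_per_image=1)
```

At 5 fps that is 18,000 calls per hour, so even at a placeholder 1 cent per image an hour of video would cost 18,000 cents (180 dollars), which is the scale problem being pointed out.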

fchollet over 12 years ago

Over a dozen experiments, the recognition rate for faces seems to be about 70%. Example of failure: only 2 faces detected here (in particular NOT the one in focus): http://iamdaveknockles.files.wordpress.com/2011/03/meeting_jpg.jpg

This is worse than OpenCV (I thought you were using OpenCV but apparently aren't?)

limejuice over 12 years ago

Tried detecting airplanes on this image with 18 airplanes, but it only detected 4 of them.

https://lh3.ggpht.com/-GbPgbhUtmnE/UH0p3VmMWoI/AAAAAAAAApM/uuP6VHzyZ44/s1600/all-planes_800.jpg

limejuice over 12 years ago

Tried to detect cats in a whole room full of cats, and it detected zero cats.

http://englishrussia.com/wp-content/uploads/2007/08/130-cats-1.jpg

zopticity over 12 years ago

Wow, I just tried this image of faces: http://3rdarm.biz/images/2010/02/faces.jpg

It got almost all of them, but with so many errors. It can't detect sheep either.

I was really impressed at first, but as I tried out more and more images, it became apparent that the API isn't mature enough to be worth one or two cents. There is about a 90% chance that the algorithm detects the image correctly, but sometimes it doesn't detect the entire object. For example, I used another image of two jets, and it only found one of them even though the jets were identical except that one was smaller than the other.

afhof over 12 years ago

Concerning the API (on page https://www.dextrorobotics.com/api):

* The documentation is pretty weak.
* I am not sure what a classID is, and I don't see any links to where the numbers come from.
* The example request is posting to an insecure http address, but the secret API key is required?
* The example request doesn't fit on one line? It took me a while to see it was in the "GET / HTTP/1.1" style.
* How do errors work? Having clearly specified error responses would be really useful.

If you are trying to sell me on your API, show it to me.

makeee over 12 years ago

Did a few tests and it works pretty well! No false positives at least.

Any plans to increase the number of objects you can search for at once? Very interested in using this but I'd want to be able to scan for ~20 objects.

agotterer over 12 years ago

Interesting technology. It got a couple correct for me, but failed on a bunch as well. Here are a few horses it failed to find correctly.

2 horses / detected 0: http://images4.fanpop.com/image/photos/23500000/horse-horses-23582505-1024-768.jpg

4 horses / detected all as 1: http://4.bp.blogspot.com/-Rso9vw4BmSE/TqZU6vHl3kI/AAAAAAAACLk/NBBmDDLC9uY/s1600/slaughter%2Bof%2Bhorses%2Be%2BOctober%2B25%2B2011%2B3.jpg

cabalamat over 12 years ago

It didn't recognise the cat in this picture (http://i.imgur.com/TgFaJh.jpg) so I'm doubtful of its practicality.

bbayer over 12 years ago

Very interesting application, but I can't see a real-life use for it via a web API. As far as I know, this kind of thing is for real-time applications, and a web-based approach might not serve the purpose.

BTW, it can find only two airplanes in this photo: http://www.q8.com/SiteCollectionImages/Gatwick%20Airport.jpg

philhippus over 12 years ago

I have been tinkering with a similar side project which you can read about here: http://artificial-intelligence-projects.com/augmented-reality/

It's still in the development stage because I can only fiddle with it when I have the time and impetus to do so. Criticisms/comments welcome.

gesman over 12 years ago

Hey - I found a bug - no cat was detected here: http://i.imgur.com/wGxWy.jpg

jschmitz28 over 12 years ago

It seems to have some trouble finding cats: http://i.imgur.com/ONFis.jpg

yesimahuman over 12 years ago

Works well! Found a few it didn't work on. For example, it didn't detect an airplane in this image (but it's a fighter jet, so maybe not part of the training set): http://cdn-www.airliners.net/aviation-photos/photos/2/8/0/2043082.jpg

stormen over 12 years ago

Hmm... It's a fantastic idea and a really great website, but the actual algorithm is very imprecise.

See this: http://i.imgur.com/ulith.png?1

You need to get a higher percentage of actual matches before you can use this for anything.

treelovinhippie over 12 years ago

Didn't work for me. That said, image recognition via an API will be huge once things mature a little more.

I've been searching lately for a post-face.com API and have been following a few for a while, but they seem to have similar issues with poor results.

ahc over 12 years ago

It'd be great if you could use this to detect nudity. Any plans for that? I'm assuming the balls on the "in the works" list are of the sport variety? ;)

In the works: Shoes, Balls, Smartphones and tablets, Dogs, Keyboards, Cups and glasses, Doors, Keys

liuliu over 12 years ago

Shameless plug: libccv supports a RESTful API as of version 0.4; it is open source and free: http://libccv.org/doc/doc-http/. Trained pedestrian / car / face detectors are included.

mephi5t0 over 12 years ago

http://rumors.automobilemag.com/files/2012/11/2013-VW-Eos-front-three-quarter-with-plane.jpg

Detected 3 planes... there is only 1 plane and a car.

captaincrunch over 12 years ago

Failed with this :(

http://i.dailymail.co.uk/i/pix/2012/11/06/article-2228752-15E0DC91000005DC-176_634x286.jpg

IceyEC over 12 years ago

That's great! I like that you guys are offering a small free usage tier!

gesman over 12 years ago

It didn't detect a face here: http://i.imgur.com/c34dX.jpg The algorithm probably got distracted and raised an exception.

piercebot over 12 years ago

Is there a good way to submit recommendations for improvements? http://i.imgur.com/yLSHW.jpg

TommyDANGerous over 12 years ago

That is amazingly awesome. Glad it can integrate with Ruby and Python. I haven't even read the whole info and I already signed up.

MasterScrat over 12 years ago

Isn't there a risk for this service to be used as an image proxy? The analyzed images are rehosted on their S3...

blaze33 over 12 years ago

Over/underdetection of bicycles: http://imgur.com/a/tIS6c

gourneau over 12 years ago

Very nice. Would y'all consider offering some embeddable solution that does not need to be run over the net?

misleading_name over 12 years ago

Nice! It does a good job with paintings too, but it did find the phantom neighbor peeping in as well:

http://img822.imageshack.us/img822/347/screenshot20130111at524.png

danielharan over 12 years ago

Are you willing to pay for this service but need higher accuracy? I'd love to hear from you.

mephi5t0 over 12 years ago

Detected 2 planes, there are 7: http://iskin.co.uk/wallpapers/imagecache/1280x800/jet_plane_formation.jpg

Seems too buggy to pay for just yet.

pkrein over 12 years ago

Seems pretty good, but my first test found a potted plant in the aeroplane demo picture -- a 100-story potted plant :P Very cool idea, super hard problem, so mad respect regardless!

leoplct over 12 years ago

I tried to search for a cow in an image of a horse, but it failed.

sinzone over 12 years ago

Hi, would love to have this API on Mashape.com
