科技回声

12 comments

sergeyk, over 13 years ago

This is a toy example of the kind of problem that the field of Computer Vision is actively working on: object detection. In a (tiny) nutshell, our best answer for general images and objects is:

1) Instead of using the full color pixel image, use an "edge image" with some simple additional normalizations. If color is important, do this per color channel.
2) Create a dataset with as many cropped examples of the target object as you can find (Mechanical Turk is useful for annotating large datasets); every other crop of every image is a negative example.
3) Train a classifier (SVM if you want it to work, neural network if you're so inclined) using this dataset.
4) Apply the classifier to all subwindows of a new image to generate hypotheses of the target object location. This can be sped up in various ways, but this is the basic idea.
5) Post-process the hypotheses using context (this can be as simple as keeping the most confident hypothesis within each neighborhood).

If you're interested in object detection, an excellent summary of the past decade of research is due to Kristen Grauman and Bastian Leibe: http://www.morganclaypool.com/doi/abs/10.2200/S00332ED1V01Y201103AIM011 (do some googling if you don't have access to this particular PDF).

A cool paper from a few months ago that should be mentioned when commenting on a post called "Where's Waldo?" is http://www.cs.washington.edu/homes/rahul/data/WheresWaldo.html
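A rough sketch of steps 1, 4 and 5 from the list above, in plain Python on a tiny grayscale image. The function names are made up for illustration, and `edge_energy` is only a placeholder score, not the trained SVM or neural network that step 3 calls for; any function mapping a window to a confidence could be passed in its place.

```python
from typing import Callable, Iterator, List, Tuple

Image = List[List[float]]        # grayscale image as rows of pixel values
Hit = Tuple[int, int, float]     # (row, col, score) of a window's top-left corner


def edge_image(img: Image) -> Image:
    """Step 1: crude edge map from horizontal and vertical absolute differences."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h - 1):
        for c in range(w - 1):
            out[r][c] = (abs(img[r][c + 1] - img[r][c])
                         + abs(img[r + 1][c] - img[r][c]))
    return out


def windows(img: Image, size: int, stride: int = 1) -> Iterator[Tuple[int, int, Image]]:
    """Yield every size x size subwindow together with its top-left corner."""
    h, w = len(img), len(img[0])
    for r in range(0, h - size + 1, stride):
        for c in range(0, w - size + 1, stride):
            yield r, c, [row[c:c + size] for row in img[r:r + size]]


def edge_energy(window: Image) -> float:
    """Placeholder score (an assumption, NOT a trained classifier): total edge strength."""
    return sum(sum(row) for row in window)


def detect(img: Image, score_fn: Callable[[Image], float],
           size: int, threshold: float, stride: int = 1) -> List[Hit]:
    """Step 4: score all subwindows of the edge image, keep those above threshold."""
    hits = []
    for r, c, win in windows(edge_image(img), size, stride):
        score = score_fn(win)
        if score >= threshold:
            hits.append((r, c, score))
    return hits


def suppress_neighbors(hits: List[Hit], radius: int) -> List[Hit]:
    """Step 5: keep only the most confident hypothesis within each neighborhood."""
    kept: List[Hit] = []
    for r, c, s in sorted(hits, key=lambda h: -h[2]):
        # Keep a hit only if it is farther than `radius` from every kept hit.
        if all(abs(r - kr) > radius or abs(c - kc) > radius for kr, kc, _ in kept):
            kept.append((r, c, s))
    return kept
```

Calling `suppress_neighbors(detect(img, edge_energy, size=4, threshold=8.0), radius=2)` on a small test image collapses a cluster of overlapping detections into one hypothesis. The exhaustive double loop is the "basic idea" version; the speed-ups the comment alludes to are not shown.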

TamDenholm, over 13 years ago

Something unrelated but perhaps interesting to some people: "Waldo" is actually a localised name for the USA and Canada; his original name is Wally. http://en.wikipedia.org/wiki/Where%27s_Wally%3F

6ren, over 13 years ago

Are there other examples of it working? (If there were links, I couldn't see them.)

There's a danger of overfitting, where a technique works for one instance (or a subset of instances), but not in general. Detecting stripes could work in general, but as a SO commenter noted, "Where's Wally" images often include spurious stripes to undermine this detection strategy for humans.

dice, over 13 years ago

The algorithm described by Heike is essentially just looking for striped red and white shirts. Anyone who's done more than a couple of "Where's Waldo?" games knows that striped shirts are often thrown in to draw one's eye. In fact, in this very example there is another striped shirt (lower left corner, just above the wall) which could very well have been Waldo that this algorithm did not highlight. Without being able to recognize Waldo's human characteristics (thin, glasses, strong chin) the approach described will inevitably fail.
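A toy illustration of this objection. It is not Heike's actual code, which is not reproduced in this thread; the color thresholds and function names are assumptions picked for the example. A detector that only counts red/white alternations cannot tell Waldo's shirt from any other striped shirt:

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]


def is_red(p: RGB) -> bool:
    # Assumed threshold: strong red channel, weak green and blue.
    r, g, b = p
    return r > 150 and g < 100 and b < 100


def is_white(p: RGB) -> bool:
    # Assumed threshold: all three channels bright.
    return min(p) > 200


def stripe_transitions(column: List[RGB]) -> int:
    """Count red<->white changes between vertically adjacent pixels."""
    count = 0
    for a, b in zip(column, column[1:]):
        if (is_red(a) and is_white(b)) or (is_white(a) and is_red(b)):
            count += 1
    return count


def looks_striped(column: List[RGB], min_transitions: int = 3) -> bool:
    """Flag a pixel column as 'Waldo-like' purely on its stripe count."""
    return stripe_transitions(column) >= min_transitions
```

Any column of pixels that alternates red and white often enough passes `looks_striped`, whether it belongs to Waldo or to a decoy, and the answer flips as `min_transitions` is tuned. Nothing here looks at glasses, build, or chin.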

rgarcia, over 13 years ago

"I had to play around a little with the level. If the level is too high, too many false positives are picked out."

I was impressed until I read that--the guy is basically fitting the model/procedure to the training set (of size 1). I'd wait for a more general approach before accepting the answer.

re, over 13 years ago

On NPR, this turns into: "an algorithm that can find Waldo in any image."

http://www.npr.org/blogs/waitwait/2011/12/18/143865340/the-wait-wait-snack-pack via http://meta.stackoverflow.com/questions/116401/stack-overflow-mentioned-on-nprs-wait-wait-dont-tell-me-and-ny-times

ofca, over 13 years ago

Programming potential never ceases to amaze me. I want to learn more. NOW!

kevinalexbrown, over 13 years ago

Cool. I've done some work on things like this before. Some of the things I do to make it work on multiple images:

Template matching is your friend in this case, because most Waldos look similar. You already tried this in a basic way by searching for the stripes of a given color. You can make it more powerful by making the template include more properties, and work in more contexts. For instance: what if Waldo's a different size?

The other option is to pretend you don't know what Waldo looks like, find him in a bunch of images, label the subimages as "waldo" candidates, measure certain properties of those subimages, and find which coordinates of the feature space take similar values across them. Then use these properties as your template.

Finally, you could train a classifier on subwindows like sergeyk suggested. This has some difficulty because Where's Waldo images are difficult to subdivide into subwindows on the scale of a single person. Do you move pixel by pixel? Do you divide it into a grid? Each grid cell will contain weird parts of people. Etc. If you do find a way to divide the image into "people" -- perhaps by doing a preliminary "person"-template sweep that identifies locations of people in the image -- then you can use a supervised learning algorithm to say "yes, this person is waldo" or "nope, FRWONG!", based on the image properties in the subwindow around that person.
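The first suggestion, template matching that still works when Waldo is a different size, might look roughly like this. It is a minimal pure-Python sketch using normalized cross-correlation on grayscale values; the function names and the nearest-neighbor resize are choices made for the example, not something the comment specifies.

```python
import math
from typing import List, Sequence, Tuple

Image = List[List[float]]


def resize_nearest(img: Image, new_h: int, new_w: int) -> Image:
    """Nearest-neighbor resize, used to try the template at several sizes."""
    h, w = len(img), len(img[0])
    return [[img[r * h // new_h][c * w // new_w] for c in range(new_w)]
            for r in range(new_h)]


def ncc(patch: Image, template: Image) -> float:
    """Normalized cross-correlation in [-1, 1]; 0.0 if either input is flat."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    return num / den if den > 0 else 0.0


def best_match(img: Image, template: Image,
               scales: Sequence[float] = (0.5, 1.0, 2.0)
               ) -> Tuple[float, int, int, float]:
    """Return (score, row, col, scale) of the best match over positions and scales."""
    best = (-2.0, 0, 0, 1.0)
    img_h, img_w = len(img), len(img[0])
    for s in scales:
        h = max(1, round(len(template) * s))
        w = max(1, round(len(template[0]) * s))
        if h > img_h or w > img_w:
            continue  # the scaled template does not fit in the image
        scaled = resize_nearest(template, h, w)
        for r in range(img_h - h + 1):
            for c in range(img_w - w + 1):
                patch = [row[c:c + w] for row in img[r:r + h]]
                score = ncc(patch, scaled)
                if score > best[0]:
                    best = (score, r, c, s)
    return best
```

Normalized cross-correlation is used here because it ignores overall brightness and contrast differences between the template and the page. The brute-force search over every position and scale is slow on real scans; it only shows the idea.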

viscanti, over 13 years ago

This needs to be an augmented reality mobile app. The problem on the AI side of things is that a good algorithm that reliably "learns" what Waldo looks like would need a substantial number of examples.

A good solution to this would get close, then calculate the probabilities of every "maybe-waldo" and then display the one with the highest probability of being Waldo. An augmented reality app that highlighted Waldo on every page would be awesome.
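One hedged way to read "calculate the probabilities of every maybe-waldo": assume exactly one candidate on the page is Waldo, turn the raw detector scores into probabilities with a softmax, and show the top one. The candidate labels and scores are hypothetical inputs from whatever detector produced them.

```python
import math
from typing import Dict, Tuple


def softmax(scores: Dict[str, float]) -> Dict[str, float]:
    """Convert raw candidate scores into probabilities that sum to 1."""
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {name: math.exp(s - top) for name, s in scores.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}


def most_likely(scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the candidate with the highest probability and that probability."""
    probs = softmax(scores)
    best = max(probs, key=probs.get)
    return best, probs[best]
```

The softmax is only meaningful if the scores are comparable across candidates and the one-Waldo-per-page assumption holds; otherwise a plain `max` over the scores picks the same winner.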

danso, over 13 years ago

Amusing application, but I'd like to see the version that finds Waldo on the page in which everyone is wearing striped shirts.

brianbreslin, over 13 years ago

Interesting problem. I'd like to then apply this concept of finding a needle in a haystack to satellite imagery. Using super-computing + giant image data sets, you could theoretically find some pretty obscure stuff if you knew what you were looking for (hidden treasures???).

jastr, over 13 years ago

This is undoubtedly a data point on the path to the singularity.