Here is why I was disappointed in Watson's showing, despite it handily beating the human competitors. The most obvious reason is that Watson's auto-clicker was a big advantage over human thumbs, so Watson got 100% of the points for clues to which all competitors knew the answer (if you asked Watson and the two humans "what's five plus five", Watson would win, but that's not necessarily proof of any sort of computer superiority).<p>The second reason is that IBM presented Watson as something of a big push in knowledge representation (I just watched a video where they talk about Watson's "informed judgments" about complicated questions, for instance). It looks instead like Watson just has an improved ability, relative to previous systems, to disambiguate words and to do quick lookups that match those words with nearby key terms.<p>For example, on the clue "Rembrandt's biblical scene 'Storm on the Sea of' this was stolen from a Boston museum in 1990", Watson correctly answered "Galilee". But its next two answers were "Gardner Museum" and "Art theft"; no one who "understood" the question in any conventional sense would even consider these as answers, because they don't make any sense. Clearly, Watson looked for instances of "Rembrandt", "Storm on the Sea of", "stolen", or other phrases from the clue in its text corpus, and found that "Galilee", "Gardner Museum", and "art theft" all frequently occurred alongside those phrases (because the painting was stolen from the Gardner Museum in an instance of art theft), and relatively rarely without them. "Galilee" probably won out of these three because Watson is tuned to Jeopardy clue styles (whenever a quoted phrase in a clue is followed by the word 'this', the clue is always asking for the answer that completes the phrase).<p>Similarly, Watson was far less confident on the clue "You just need a nap! You don't have this sleep disorder that can make sufferers nod off while standing up."
It still got the right answer of "Narcolepsy", but with a relatively low confidence of 64%. "Insomnia" had a confidence of 32% despite clearly being the opposite sort of sleep disorder, and "deprivation" appeared at 13%, despite not being a sleep disorder. Here Watson gets confused because the only term in the clue that appears more frequently with "narcolepsy" than with "insomnia" is "standing up"; my guess is that if "standing up" had been replaced by some oddly phrased, uncommon synonym, Watson wouldn't have been able to come up with an answer, despite the clue conveying exactly the same information.<p>This kind of cleverness is certainly impressive, but it seems like an advance in tuning existing techniques to the format of Jeopardy, not an advance that will spark other successful projects down the line. IBM's goal of giving us "the computer from Star Trek" doesn't seem any closer; I don't see any evidence that Watson could have answered a question that required more thought or understanding than a simple text search. If the question had been "how many kings ruled England in between Henry the Fourth and Henry the Eighth" (8), Ken and Brad would have been able to answer relatively easily, while my guess is that Watson would be stumped.
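<p>To make the lookup I'm guessing at concrete, here is a minimal sketch of scoring candidate answers by how often they co-occur with clue terms. This is purely my speculation about the general technique, not Watson's actual pipeline; the toy corpus, the function name cooccurrence_scores, and the scoring rule are all invented for illustration.

```python
from collections import Counter

# Toy corpus, invented for illustration; this is NOT Watson's data.
CORPUS = [
    "rembrandt painted storm on the sea of galilee",
    "storm on the sea of galilee was stolen from the gardner museum in 1990",
    "the gardner museum art theft remains unsolved and the rembrandt is still missing",
    "insomnia is a sleep disorder",
]


def cooccurrence_scores(clue_terms, candidates, corpus):
    """Hypothetical scorer: for each candidate, add up the number of clue
    terms found in every passage that also mentions the candidate."""
    scores = Counter()
    for passage in corpus:
        # Number of clue terms that appear in this passage (substring match).
        hits = sum(term in passage for term in clue_terms)
        for candidate in candidates:
            if candidate in passage:
                scores[candidate] += hits
    return scores


clue_terms = ["rembrandt", "storm on the sea of", "stolen"]
candidates = ["galilee", "gardner museum", "art theft", "insomnia"]
scores = cooccurrence_scores(clue_terms, candidates, CORPUS)

for candidate in candidates:
    print(candidate, scores[candidate])
```

On this toy corpus "gardner museum" and "art theft" both score above zero, even though neither could complete the quoted phrase, while the unrelated "insomnia" scores zero. A scorer like this has no notion of what kind of answer the clue wants, which is the behavior I'd expect from a pure text lookup.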