Wow, this is pretty intriguing and may actually be a solid differentiator in the browser space, since it probably requires a considerable stored database of speech samples coupled with a decent back-end server farm to do it effectively. Hard for the other players to replicate. Clever move, Google!

I wonder if they will add this as a standard feature for any text field at some point? It's probably not going to get much sunlight if it requires a Chrome-specific attribute on the field.
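For reference, the Chrome-specific hook looks roughly like this; a minimal sketch, assuming the vendor-prefixed x-webkit-speech attribute the demo appears to use (a plain "speech" attribute has also been proposed, but support is assumed to be Chrome-only):

    <!-- Adding the attribute puts a small microphone icon in the field;
         clicking it records speech and fills the value with the transcript. -->
    <input type="text" x-webkit-speech>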
I suspect this slide's covert purpose is to make me (yes, just me) sit here and say 'hello' to my computer like a moron for 3 minutes while seeing no effect whatsoever. In this it has succeeded brilliantly.

Running Chrome, mic is on, no one is home... sigh. I so wanted to be wowed.
It actually does a pretty good job with simple words and sentences. So, jumping in the deep end, I tried "The reflected binary code was originally designed to prevent spurious output from electromechanical switches". Can anyone get it to recognize that? I did manage to get it to respond correctly to every word by itself (sometimes only after a couple of tries), but not the whole thing.

(non-native speaker)
Varies from poor to amazing.

"I have met Jesus, he was a nice guy" -> "ice melt cheese"

"hacker news is amazing" -> "hacker news"

"are you afraid of santa claus?" 100% correct

"if a woodchuck could chuck wood how much wood would a woodchuck chuck" 100% correct
Did anybody check out the slide before this one, device orientation? http://slides.html5rocks.com/#slide23

That's pretty awesome too; I could see this being great for mobile web apps, especially games.
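If you haven't seen it, the gist is just a DOM event that reports the device's rotation angles; a minimal sketch (the deviceorientation event is the standard mechanism, but actual sensor availability depends on the device and browser):

    // Listen for orientation changes and read the three rotation angles:
    // alpha = compass heading (0-360), beta = front-back tilt, gamma = left-right tilt.
    window.addEventListener('deviceorientation', function (event) {
      console.log(event.alpha, event.beta, event.gamma);
    }, false);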
What does this have to do with HTML5? Isn't it up to the UA to determine how best to accept form input? Specifying in the form that a particular field is a "voice recognition" field seems to be encoding presentation details in what should be structure.

I can understand that it's important to mark a particular form field as more "important" than others (and thus more likely that a user would want to use their voice to input text into it), but wouldn't this be better served by semantic markup declaring the field as a "primary" field or some such?
I whipped up a Chrome extension for voice search if anyone is interested:

http://dl.dropbox.com/u/1047706/VoiceSearch.crx

https://github.com/raneath/chrome-voice-search
This is exactly like the speech recognition on Android. It works brilliantly with short phrases that also happen to be popular searches on Google (or Google Voice Search), but fails at longer or obscure sentences. It's all about the data, baby.

I use Voice Search heavily on my Desire, but I prefer to type out my communications because of this exact limitation.
That is awesome; it works even for German without a problem. I couldn't get it to recognize an English sentence properly (which probably only means that my English pronunciation is horrible). I'm wondering, however, how they manage to recognize the language in the three-word sentences I tried.
It was rather good, but not nearly good enough to rely on for anything practical. It felt a bit like this: http://www.youtube.com/watch?v=5FFRoYhTJQQ
What version of Chrome does this work on?
Either I'm missing something or I'm on an older version of Chromium: Chromium 5.0.375.127 (Developer Build 55887) on Ubuntu 10.04.
Two things I would want upon seeing this:

1. A Chrome extension to use speech recognition in every text box (a rough sketch of how that might look is below).

2. Speech recognition inside the Google apps: Gmail, etc.
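For the first item, a minimal content-script sketch, assuming Chrome's x-webkit-speech attribute and an extension that injects this script into every page (manifest wiring omitted, selector coverage deliberately naive):

    // Add the speech mic button to every plain text input on the page.
    var inputs = document.querySelectorAll('input[type="text"], input:not([type])');
    for (var i = 0; i < inputs.length; i++) {
      inputs[i].setAttribute('x-webkit-speech', '');
    }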