Google has developed speech-recognition technology that actually works.

101 points by pelf, about 14 years ago

10 comments

orijing, about 14 years ago
Sorry to be a killjoy, but the premise of this article--that Google developed the speech-recognition technology--hurts my feelings (to say the least) and underestimates the contributions of the NLP community.

Speech recognition, like machine translation, is academic in origin, and much of the work is still carried out in academia. For example, Google did not "invent" machine translation. No, Google Translate is an adapted version of academic systems. Perhaps the phrase tables are sharded and so is the language model, but the general algorithms are the same. Sure, one of Berkeley's NLP grads is working there, but it's basically an adapted version of what's available. They publish papers like "Stupid Backoff" [1], but that makes them as much a contributor as any other member of the NLP community.

Speech recognition is the same thing. Google is the company that takes existing research and adapts it.

To claim that Google developed the speech recognition technology is to discredit the contributions of everyone else in the NLP community. Google has been generous at funding NLP research at the university level. Do you consider the results of that research to be Google's?

Ultimately, the main difference is that Google has orders of magnitude more data and the physical capacity to handle it, not that it solved some systems or architectural bottleneck that has been limiting us. Someone once said that all you need is a crappy model and great data to build a good ML-based algorithm...

[1] http://acl.ldc.upenn.edu/D/D07/D07-1090.pdf

JesseAldridge, about 14 years ago
I tried it up. It doesn't work went well services dept.

Maybe this is my voice doesn't give me a couple of words wrong in all 50 states. Did welcome center for dental bar kansas city missouri.

----

I tried it out. It doesn't work quite as well as the article suggests.

Maybe it's just my voice, but it gets at least a couple of words wrong in almost every statement. "It", "welcome", and "thoughts", for example, are consistently misheard.

IDisposableHero, about 14 years ago
It got "how much wood would a woodchuck chuck if a woodchuck would chuck wood" right for me. I am impressed now.

But it couldn't handle "Ph'nglui mglw'nafh Cthulhu R'lyeh wgah'nagl fhtagn". Maybe I'm not pronouncing it quite right.

antiterra, about 14 years ago
It even seems strange to me that the article makes no attempt to survey other current examples of speech-recognition technology in order to support the unstated implication of its lede. That is: "speech-recognition technology developed by those other than Google does not work."

I just downloaded the Bing app for iPad last night, and noticed it has a pretty decent speech-recognition engine from a company they acquired a few years back: TellMe. I tried all the examples given in the Slate article, and they were recognized just fine.

This makes me curious: are there a number of current-generation speech-recognition technologies that work at the level of Google's?

I should note that I didn't get the desired behavior once my speech was recognized. When I asked the math question, a link to Wolfram Alpha was given, but I would have to click that link to get the answer. I had to go to maps to get any kind of relevant answer for "Directions to McDonalds," and I had to just say McDonalds and then click Directions to get the actual information. This failing appears to be a trait of the iOS app itself. Hand-typing the math query into Bing on a proper browser did give me Google Calculator-style results.

MatthewPhillips, about 14 years ago
I wonder at what point Apple is going to have to start building these kinds of technologies. Purchasing Siri was supposed to be a step in that direction, but nothing has come of it yet.

I worry that it's not in Apple's DNA to build products they can't directly charge for. Apple doesn't do freemium. The hope would be that third parties would pick up the slack.

However, I'm not sure that startups can match Google in big data. So how will Apple catch up in voice recognition? In mapping (which also uses Android data to improve things like traffic and rerouting)?

micah63, about 14 years ago
Whatever anyone wants to say negatively about this article, it's bang on. I just got my first Android phone. It runs 2.2, and the little microphone is my new best friend. I talk out todos, write emails and texts, search YouTube, search everything. It blows my mind. The only thing it's not good at is uncommon names and places (like my name "Micah" and "Quebec"). Rock on, Google.

martythemaniak, about 14 years ago
"It even works if you've got an accent."

I have an indeterminate accent and my voice is on the low side of bass, so I trip up Google voice pretty badly; it's mostly useless for me. FWIW, Rock Band also can't make sense of what I say. I do wonder when it'll be good enough to understand me.

kajecounterhack, about 14 years ago
Does anyone happen to know if there are significant companies in the speech-recognition space besides Dragon and Google?

jodrellblank, about 14 years ago
What are the chances that Google can pick up enough from our voices to biometrically identify us from a crowd in future?

E.g., on a Google phone conversation, or when licensed to a surveillance company.

PonyGumbo, about 14 years ago
Hopefully they'll be able to apply this to Google Voice. The voicemail transcriptions are almost always hilariously wrong.