The Uncanny Valley of Voice Recognition

75 points by hodgesmr, over 10 years ago

15 comments

anatari, over 10 years ago
Voice recognition is not in an uncanny valley. Uncanny valley means there is a point where something that is less real is better than something that is more real. Pixar improves the scene by adding elements that are unrealistic. Another example is a preference for lower frame rate movies.

Right now, every incremental improvement to voice recognition improves its usefulness. It might appear that we're in an uncanny valley because voice recognition is barely usable right now versus completely unusable in the past, but there is no one who prefers worse voice recognition over better voice recognition.
51Cards, over 10 years ago
"The Uncanny Valley is a term that originated from the computer animation industry. In 1992, while finishing A Bug’s Life, Pixar had to build a digital valley for..."

Ummm....

Wikipedia: "The term was coined by the robotics professor Masahiro Mori as Bukimi no Tani Genshō in 1970. The hypothesis has been linked to Ernst Jentsch's concept of the "uncanny" identified in a 1906 essay "On the Psychology of the Uncanny"."
b6, over 10 years ago
I apologize, I'm pretty sure I feel the way I do because I'm getting old, but here's how I feel: talking to computers is a really, really bad interface, so I don't do it.

One reason it's bad is that the sounds we make are mush. It's a miracle if a computer system can correctly retrieve the words from an utterance. Another reason it's bad is that the words we say are nonsense. Our sentences aren't parseable; they don't conform to any actual grammar.

So I see it as another example of people selling something that's supposed to be more convenient than what we already have, but for many reasons, it probably isn't. One day it may be, but it wouldn't be surprising for people to be selling it as more convenient for many years before it actually is.

I'm not criticizing the technology -- it's amazing. It's just clear to me that it isn't ready to be invited into my life. I consider it inevitable that we will eventually lose control of technology, but we can at least try to be judicious.
Jgrubb, over 10 years ago
This post reads like he's experimenting with the Uncanny Valley of Auto-generated Blogging.
zanny, over 10 years ago
I'm still bummed that with all these companies implementing voice recognition there still is not anything close to a FOSS option. It is a major field and the kind of software that takes a huge amount of work to get right, and I feel like in the future free operating systems are going to look archaic without it, but it does not seem like the kind of thing any small club of friends can pick up and build to match Google or Apple.

The same applies to OCR and other photo recognition techniques like faces or red eye. Tesseract is probably the largest free software OCR project, but it still seems to do so much worse than proprietary Adobe and Microsoft products. At least the OCR reader that came with my S4 does a terrible job, though it might be using Tesseract behind the scenes since I think it's the one from F-Droid.

Digikam does all right red eye correction, but it does it with a layered filter rather than any recognition of eyes. It also sometimes can find faces, but not nearly as accurately as Google can.

All these fuzzy logic fields are things that take huge code bases and a lot of R&D to get right, and nobody in the free software movement has the organization or just the raw bank to make them happen from what I can see. Red Hat surely is not investing in them (kind of outside their enterprise / server domain) and they are about the only company prominent and powerful enough to do it.
aaronpk, over 10 years ago
Is it just me or is the author using the term "Uncanny Valley" completely wrong? Ignoring the silly Pixar story, I still don't understand how voice recognition (or more accurately, speech recognition) is currently in the uncanny valley.

You know when your GPS says "recalculating" in a condescending voice? *That's* the uncanny valley of text-to-speech.
VLM, over 10 years ago
One important point about voice recognition is that in the short term it's OK if it's slower and harder to use than superior technologies, as long as everyone knows it costs a lot of money.

Once that fad aspect blows over, usage plummets and it's forgotten. See Kinect, or the Nintendo Power Glove, or QR codes, or Google Glass, or the CueCat, or a zillion other examples that are in, or now entering, 8-track-hood.
wodenokoto, over 10 years ago
Uncanny valley did not originate in the computer animation industry. It originated in robotics, and was coined by Masahiro Mori in 1970.

http://en.m.wikipedia.org/wiki/Masahiro_Mori
idbehold, over 10 years ago
> The Uncanny Valley is a term that originated from the computer animation industry.

Uhh, I don't think so: https://en.wikipedia.org/wiki/Uncanny_valley#Etymology
yehat, over 10 years ago
The Uncanny Valley of HN comments gives plenty of credit to the author's post (which I enjoyed). Most of the comments sound like they are coming from an underdeveloped AI: awkward perception and a total lack of a human sense of humor.
kleiba, over 10 years ago
"Voice recognition" sounds more reminiscent of speaker identification than speech recognition to me. Although I don't work on that myself, my day job is in a related field, and even IBM's "speech to text" is a term I never hear being used (unlike for instance "text to speech"). People around me either say "speech recognition" or "ASR" (for automatic speech recognition).

I'd be interested to learn, though, if / where alternative terms are in more widespread use.
alttab, over 10 years ago
The truly best voice recognition pretty much has to be hooked up to an uber-AI. Gaining a friend, yes.

Imagine if they could program things into it that would make your overall life better by slightly altering your behavior. For instance, if you asked "Siri" to remind you every 40 minutes to take a cigarette break, I can imagine her slowly weaning you off, etc.
sukilot, over 10 years ago
I think Zach was drunk on this one.

Also, "eat a dick", really?
shanselman, over 10 years ago
Um. No.
tashoecraft, over 10 years ago
I'd really like it if someone condensed that article and removed all attempts to be funny or sound extremely clever.