Voice Synthesis for in-the-Wild Speakers via a Phonological Loop

65 points by itamarb, almost 8 years ago

9 comments

bluetwo, almost 8 years ago
I still think emphasis on a word or syllable is important here, as far more information is conveyed through inflection than you realize.

Consider:

*I* am going to eat the ham sandwich = Me, no one else
I *am* going to eat the ham sandwich = Nothing can stop me
I am *going* to eat the ham sandwich = On my way; got distracted
I am going *to* eat the ham sandwich = In case you doubt my intent
I am going to *eat* the ham sandwich = I will not be juggling it
I am going to eat *the* ham sandwich = The ultimate ham sandwich will be mine
I am going to eat the *ham* sandwich = Not turkey, not roast beef
I am going to eat the ham *sandwich* = Between two slices of bread is what I do
olegkikin, almost 8 years ago
Similar in quality to Lyrebird:

https://soundcloud.com/user-535691776/dialog

Google WaveNet sounds almost perfect in comparison:

https://deepmind.com/blog/wavenet-generative-model-raw-audio/
abhishek0318, almost 8 years ago
Mix this with AI creating video from audio (http://spectrum.ieee.org/tech-talk/robotics/artificial-intelligence/ai-creates-fake-obama) and you can make anyone say anything.
Animats, almost 8 years ago
Coming soon, audio ads with your friends' voices.
azinman2, almost 8 years ago
To me this is very exciting. I'm already working on my own home digital assistant modeled on NeNe Leakes from the Real Housewives to add personality to otherwise boring conversations with a robot. I've been looking at various style transfer techniques, and having something a bit more plug & play will help me focus on the more unique parts. I predict that we'll see more celebrity voices used as conversational interfaces become more common.

Part of the complexity is going from 'context-free phonemes' to actually modeling personality: the voice needs some way to embed emotion, ideally inferred contextually from the sentences themselves. NeNe is an interesting example, as she adds so many non-verbal sounds to her dialog (bleeps and bloops and eye rolls that she translates into affected speech). That's part of what makes her NeNe, and a big part of the entertainment value. Pursuing that is what will bring style transfer to the next level... total personality emulation. I fantasize about basic animatronics that can move her head side to side, twirl, and literally give eye rolls.

If anyone wants to work on this with me, give me a ping @azinman on Twitter. I've been thinking about this as an open source project, but I'm still keeping my options open as I continue development. I've got a ton more ideas for integrating her with my bleeding-edge smart home, far more than just personality emulation (including what I believe to be a breakthrough in passive context-sensing... the real key to making the smart home actually smart).
johannkaupen, almost 8 years ago
There are too many ways to commit fraud with this to list here.

One example: not too long ago I still handled the more important banking matters with a quick phone call (they couldn't be done entirely online).
digi_owl, almost 8 years ago
For some reason this page gives Firefox a fit, and that is with multiprocessing enabled...
placeybordeaux, almost 8 years ago
Anyone else having trouble with the audio samples?
m00dy, almost 8 years ago
I'm waiting for the code samples :)

Thanks