Hey there! I'm one of the authors of the paper and I'm happy to answer any questions anyone may have!<p>Make sure to check out the paper on arXiv as well.
Interesting. This is not TTS like we are accustomed to; they are replicating a specific person's voice with TTS. Listen to the ground-truth recordings at the bottom and then the synthesized versions above. "Fake News" is about to get a lot more compelling when you can make anyone say anything as long as you have some previous recordings of their voice.
Semi-related to the Baidu speech research:
<a href="http://chrislord.net/index.php/2017/02/23/machine-learning-speech-recognition/" rel="nofollow">http://chrislord.net/index.php/2017/02/23/machine-learning-s...</a><p>The work is done by Mozilla.
> "We conclude that the main barrier to progress towards natural TTS lies with duration and fundamental frequency prediction, and our systems have not meaningfully progressed past the state of the art in that regard."<p>Who is working on this problem, and how?
OK, that went from uncanny valley to flipping amazing. I could picture the person speaking. An old lady. A young woman. It was hard to picture an algorithm in a machine.<p>It's amazing that it all boils down to 1s and 0s and some boolean logic.
Has anyone seen this yet? <a href="https://www.youtube.com/watch?v=XfcqBElF0ZI" rel="nofollow">https://www.youtube.com/watch?v=XfcqBElF0ZI</a><p>So many innovations happening with voice-related technology.
Very nice paper - one of my colleagues discovered it. I have been trying to understand the details, but I do not see how your stacked dilated layers are arranged. "d" is mentioned once, but no description is given.
If I understand this correctly, it's a pretty big achievement on the way to being able to replicate any person's voice in the future, given enough audio samples. Amazing.
Similarly, I have seen lip movement (talking) replicated using machine learning. Having completely artificial (or even real) identities say whatever you want them to on video is not that far off, I guess (simpler than general AI or even fully self-driving cars), which is both amazing and terrifying.