64 points by daisystanton almost 7 years ago

3 comments

buss almost 7 years ago

Wow, these audio samples are incredible. I'm surprised to hear the model actually outputting natural-sounding breathing between and inside sentences. Most TTS systems explicitly remove things like that, but the addition of breathing makes it sound so much more natural.

The style tokens result in pretty incredible and realistic audio.

zestyping almost 7 years ago

Good heavens. These give me the shivers. In some of these samples you can hear breathing, emphasis, and even what sounds like genuine emotion.

sgillen almost 7 years ago

This seems like it could be great for automatically generating audiobooks. Personally, I would one day like to have a program that can read arbitrary text to me in a more or less human way, which would let me get through papers for work while driving.

Predicting Expressive Speaking Style from Text in End-To-End Speech Synthesis
