TechEcho

Predicting Expressive Speaking Style from Text in End-To-End Speech Synthesis

64 points by daisystanton almost 7 years ago

3 comments

buss almost 7 years ago
Wow, these audio samples are incredible. I'm surprised to hear the model actually outputting natural-sounding breathing between and inside sentences. Most TTS systems explicitly remove things like that, but the addition of breathing makes it sound so much more natural.

The style tokens result in pretty incredible and realistic audio.
zestyping almost 7 years ago
Good heavens. These give me the shivers. In some of these samples you can hear breathing, emphasis, and even what sounds like genuine emotion.
sgillen almost 7 years ago
This seems like it could be great for automatically generating audiobooks. Personally, I would one day like to have a program that can read arbitrary text to me in a more or less human way. That would let me listen to papers for work while driving.