I was thinking about all the new booming AI applications and wondered how they will affect Text-to-Speech and Speech-to-Text. Personally, what I hope for most is better TTS, and a free one at that, though API calls have a cost. I'm also wondering how TTS could affect the audiobook market if it became good enough.
On your first point, there are some really good results in this paper:

Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers
https://arxiv.org/abs/2301.02111

Website with examples: https://valle-demo.github.io/

For your second question, Apple is already rolling out AI-narrated audiobooks. See: https://arstechnica.com/gadgets/2023/01/apple-rolls-out-ai-narrated-audiobooks-and-its-probably-the-start-of-trend/