Hello friends,
I just noticed that https://coqui.ai/ is "Shutting down".<p>I'm building a web app (React / Django) that takes a list of affirmations & goals (in Markdown files), stores them in a database (SQLite), and uses voice synthesis to create audio files of the phrases. These are combined with a relaxed backing track (ffmpeg), grouped into playlists of 10-20 phrases (randomly sampled, or chosen by theme: "mind", "body", "soul"), and then played automatically in the morning & evening (cron). This lets you persistently hear & vocalize your own goals & good vibes over time.<p>I had been planning to use Coqui TTS as the local text-to-speech engine, but with this shutdown, I'd love to hear from the community: what is a great open-source, local text-to-speech engine?<p>Generally, I try to learn both the highest-quality commercially available technology (example: ElevenLabs) and the best open-source equivalent. Would love to hear suggestions & perspectives on this. What voice synth tools are you investing your time into learning & building with?
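For anyone picturing the first half of that pipeline (Markdown files, then SQLite, then a sampled playlist), here is a rough sketch using only the Python standard library. The Markdown layout (a theme heading followed by bullet phrases), the `affirmation` table, and all function names are assumptions made up for illustration, not the actual app's schema.

```python
import random
import re
import sqlite3

# Hypothetical Markdown layout: a "## theme" heading followed by "- phrase" bullets.
SAMPLE_MD = """\
## mind
- I focus on one task at a time.
- I learn something new every day.

## body
- I move my body every morning.
"""


def parse_affirmations(markdown: str) -> list[tuple[str, str]]:
    """Return (theme, phrase) pairs from headings and bullet items."""
    pairs = []
    theme = "general"  # used for bullets that appear before any heading
    for line in markdown.splitlines():
        heading = re.match(r"^#+\s+(.*\S)\s*$", line)
        bullet = re.match(r"^\s*[-*]\s+(.*\S)\s*$", line)
        if heading:
            theme = heading.group(1).lower()
        elif bullet:
            pairs.append((theme, bullet.group(1)))
    return pairs


def load_into_db(conn: sqlite3.Connection, pairs: list[tuple[str, str]]) -> None:
    """Create the (hypothetical) affirmation table and insert the pairs."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS affirmation ("
        "id INTEGER PRIMARY KEY, theme TEXT NOT NULL, phrase TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO affirmation (theme, phrase) VALUES (?, ?)", pairs
    )
    conn.commit()


def sample_playlist(conn, size, theme=None, rng=None):
    """Pick up to `size` phrases at random, optionally limited to one theme."""
    rng = rng or random.Random()
    if theme is None:
        rows = conn.execute("SELECT phrase FROM affirmation ORDER BY id").fetchall()
    else:
        rows = conn.execute(
            "SELECT phrase FROM affirmation WHERE theme = ? ORDER BY id", (theme,)
        ).fetchall()
    phrases = [row[0] for row in rows]
    # Never ask for more items than exist, otherwise random.sample raises.
    return rng.sample(phrases, min(size, len(phrases)))
```

Calling `sample_playlist(conn, random.randint(10, 20), theme="mind")` would give one themed playlist of 10-20 phrases. In a Django app the table would more likely be a model, so treat the raw SQL here as a stand-in.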
Mozilla's browser TTS is not bad: just parse and buffer one sentence at a time and it does all right.<p>For the backend, I've experimented with Piper, which has a lot of voices and accents, though it's tricky to buffer and sync long texts.<p><a href="https://github.com/rhasspy/piper">https://github.com/rhasspy/piper</a>
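The sentence-at-a-time approach can be sketched for a Piper backend roughly as follows. This assumes the `piper` executable is on the PATH and accepts text on stdin with `--model` and `--output_file`, as in the README usage I remember; check it against your installed version. The regex splitter is deliberately naive, and the function names and output file naming are hypothetical.

```python
import re
import subprocess
from pathlib import Path


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace.

    It will wrongly split abbreviations such as "Dr. Smith"; a real app may
    want a proper sentence tokenizer instead.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def build_piper_command(model_path: str, wav_path: str) -> list[str]:
    """Build the CLI call (assumed flags: --model and --output_file)."""
    return ["piper", "--model", model_path, "--output_file", wav_path]


def synthesize_sentences(text: str, model_path: str, out_dir: str) -> list[Path]:
    """Write one WAV file per sentence and return the paths in order."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    wav_paths = []
    for index, sentence in enumerate(split_sentences(text)):
        wav_path = out / f"sentence_{index:04d}.wav"
        # The sentence is sent to piper on stdin; check=True raises on failure.
        subprocess.run(
            build_piper_command(model_path, str(wav_path)),
            input=sentence.encode("utf-8"),
            check=True,
        )
        wav_paths.append(wav_path)
    return wav_paths
```

Something like `synthesize_sentences(long_text, "voice.onnx", "out/")` (with `voice.onnx` standing in for whichever Piper voice model you downloaded) yields short per-sentence files, which are easier to queue and keep in sync than one long render.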