Prime Voice AI: AI speech software

61 points by danboarder over 2 years ago

21 comments

What? Most of the voices I tried sound really intense, even angry. Very strange emotional flavors for what should be quite neutral text inputs. The laughing was literally "ha, ha, ha, ha". Not even remotely a genuine human laugh. The actual sound quality of the output is impressive (clear treble, no weird artifacts between syllables, etc.), but I just don't understand the weird "edginess" of the speech.

looktropic over 2 years ago

Best I've ever heard. Steep pricing, though, for only 2 hours a month and only 2,500 characters at a time. I was about to sign up to have it read articles to me, but that amounts to about 4 articles per month, each fed into the generator in parts.
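
Feeding an article to the generator in parts could be scripted. The following is a minimal sketch, assuming the 2,500-character limit quoted above applies per request; the function name `split_for_tts` and the sentence-boundary heuristic are illustrative assumptions, not part of any ElevenLabs tooling.

```python
import re

MAX_CHARS = 2500  # per-request limit quoted in the comment above (assumed)


def split_for_tts(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Chunks break at sentence boundaries where possible; a single
    sentence longer than `limit` is cut at the limit. Hypothetical helper.
    """
    # Split after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Hard-split any sentence that cannot fit in one chunk.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            # Adding this sentence would overflow; start a new chunk.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk could then be pasted or sent to the generator in order. Breaking at sentence boundaries rather than at a fixed offset is a design choice meant to avoid cutting a word or phrase in half between two audio clips.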

tekni5 over 2 years ago

ElevenLabs is so good not because of the default voices but because it's so easy to train new voices. You only need a minute or two of someone speaking and it can mimic the voice pretty well, good enough to fool most people. However, their pricing is completely wrong; it should be cheaper and offer more.

ebeip90 over 2 years ago

For hobbyist use, is this really any better than macOS' "say" command?

Once you've downloaded the Premium voices (e.g. Zoe), it's just a CLI, no API or hidden bells and whistles.

    $ say -v 'Zoe (Premium)' "This is an example of the Zoe voice for my comment on Hacker News."

You'll have to download the voice ahead of time, but Zoe (public) and Maeve (internal) are both excellent voices.

pelasaco over 2 years ago

The voice sounds good. However, I would like to see whether it's able to parse and read, e.g., a PDF file with a good flow. I use (and pay for) Speechify on a daily basis to read through PDF books for my studies. I see that they still have a lot to improve, but I still couldn't find a better solution. Any suggestions?

oldstrangers over 2 years ago

It's pretty good. I've been using Amazon's Polly (https://aws.amazon.com/polly/), which so far has been the most realistic to me. I feel like Polly still has an edge in variety of voices.

Zetobal over 2 years ago

Azure is so far ahead on neural voices it's not even funny. https://azure.microsoft.com/en-us/products/cognitive-services/text-to-speech/#overview

andreyk over 2 years ago

Related - I've found BeyondWords to be really nice. Its generated speech is not quite this good, but it's close, and it has a library of fairly different voices. Plus, its UI allows you to create audio with a mix of voices, which is not offered by most other such services. Plug warning - I've been using it to create narration for short stories for a while, and the output is better than I would have expected. Here's a recent example involving two characters talking - https://storiesby.ai/p/melancholy-musings-over-drinks

Savaaki over 2 years ago

Have you heard their demo reading The Great Gatsby? Best TTS I've ever heard by a margin ... https://www.youtube.com/watch?v=qRPTwPuZLjk

grouphoot over 2 years ago

Slow your roll Eleven. Sounds like you have beef with everything I feed you.

sublinear over 2 years ago

The default "Adam" voice sounds lifelike, but I wouldn't call him "conversational/clear". He sounds too forceful and dramatic, like he belongs in a cartoon.

exodust over 2 years ago

With real voice actors, we can direct them to say their lines with more sadness. Or guarded desperation and struggle, on the verge of crying but clinging to hope... etc. This kind of subtle direction is not possible with artificial speech. For narration it can work. But for dramatic character acting in animated films, the results make the characters sound like terrible actors. More granular control is needed over specific words, syllables, tone, emphasis and timing.

jjkmk over 2 years ago

Gave it a test, and wow, it's very impressive. There's a lot you can do with the free version too; hopefully this takes off.

daggersandscars over 2 years ago

Is there an open source or perpetual license way of “cloning” one's voice? This would be a boon to those who have lost or will lose the ability to speak or speak well. Especially if it can be integrated into communication apps and one's cell phone. The number of people who could use this is going up as the HPV+ head and neck cancer wave ramps up.

fxtentacle over 2 years ago

In case anyone knows, what's the defensible moat here? I can get almost the same quality using open source models. Plus I can fine-tune them to get custom voices. That means any company that needs TTS is better off paying me once to build them a customized open source solution instead of forever paying this company per minute.

dang over 2 years ago

Recent and related:

This Voice Doesn't Exist – Generative Voice AI - https://news.ycombinator.com/item?id=34361651 - Jan 2023 (260 comments)

pigtailgirl over 2 years ago

-- think Google Neural2 sounds better - https://cloud.google.com/text-to-speech/docs/voices --

pupppet over 2 years ago

I want to hear this voice applied to ChatGPT output.

satvikpendem over 2 years ago

Is there an open source version we can use?

andrewstuart over 2 years ago

Didn’t work from my iPhone.

_boffin_ over 2 years ago

Wow!