I tried it with a paragraph of English taken from a formal speech, and it sounded quite good. I would not have been able to distinguish it from a skilled human narrator.

But then I tried a paragraph of Japanese text, also from a formal speech, with the language set to Japanese and the narrator set to Yumiko Narrative. The result was a weird mixture of Korean, Chinese, and Japanese readings for the kanji and kana, all with a Korean accent, and numbers read in English with an American accent. I regenerated the output twice, and the results were similar. Completely unusable.

I tried the same paragraph on ElevenLabs. The output was all in Japanese and had natural intonation, but there were two or three misreadings per sentence that would render it unusable for any practical purpose. Examples: 私の生の声 was read as "watashi no koe no koe" when it should have been "watashi no nama no koe". 公開形式 was read as "kōkai keiji" instead of "kōkai keishiki". Neither kanji misreading would be correct in any context. Even weirder, the year 2020 was read as 2021. Such misreadings would confuse and mislead any listeners.

I know that Japanese text-to-speech is especially challenging because kanji can often be read many different ways depending on the context, the specific referent, and other factors. But based on these tests, neither PlayAI nor ElevenLabs should be offering Japanese TTS services commercially yet.