Forum Discussion
Japanese AI Text-to-Speech Quality
Has anyone had success using a particular voice for the AI Text-to-Speech for Japanese? I have tried several and continue to get feedback that the text-to-speech quality is very poor, and some voices randomly mix Japanese, Chinese, and Korean accents even within a single paragraph. All of the translated audio text has been thoroughly reviewed and is accurate. Here are the latest voices I have tried and the general feedback received:
- Alva: Good (the voice used in the course)
- Akira: Poor
- Hajime: Poor
- Gojo: Acceptable
- Ken: Good (possible alternative, but does not address the root cause)
- Masa: Acceptable
In the Advanced settings, I have Multilingual v2 selected for each one.
Note that even for the voices marked good or acceptable, the reviewer still indicates that the quality is not at a level that they feel can be used. They believe the root cause is that the program invoking text-to-speech does not consistently retain the selected voice option and that it incorrectly auto-detects the language and switches voice options even within a single sentence. I am not sure how to respond to that.
I would appreciate any insights anyone else has about this topic!
2 Replies
- LisaDobias-c0c7 (Community Member)
I should note that the voice feedback above reflects ease of listening, not pronunciation accuracy.
Hello LisaDobias-c0c7,
I appreciate you reaching out! While we don't have any news on any additional advanced settings for AI TTS Voices, I've shared your insight with the product team. We'll be sure to share any future updates in this thread, so all are in the loop!
I'm curious if you've tried the following:
- Do you notice any change to the ease of listening if you generate audio with v3 (Beta) instead of Multilingual v2?
- Have you used any supported SSML tags to improve cadence or flow?
I'll open this discussion up to our fabulous community to share their experiences and what has worked well for them!
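On the SSML question above, here is a rough sketch of one way to prepare the script text before pasting it into the text-to-speech dialog. It assumes that a `<break time="..."/>` tag is among the supported SSML tags (please check the supported list first), and the `add_pauses` helper and the 0.4s default are only illustrative, not anything built into the product. It inserts a short pause between Japanese sentences, which may help with cadence:

```python
import re


def add_pauses(text: str, pause: str = "0.4s") -> str:
    """Insert an SSML break tag between Japanese sentences.

    Hypothetical helper: assumes the TTS tool accepts <break time="..."/>.
    """
    # Split right after each Japanese full stop so it stays with its sentence.
    sentences = [s.strip() for s in re.split(r"(?<=。)", text) if s.strip()]
    # Join the sentences with a break tag; no tag is added after the last one.
    return f'<break time="{pause}"/>'.join(sentences)


if __name__ == "__main__":
    print(add_pauses("こんにちは。今日の研修を始めます。"))
```

This only addresses pacing. It would not be expected to fix a voice changing accent or language partway through a paragraph.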