AI TTS and SSML functionality

Staff

2 months ago

Hi PeterGrennan, BrendtWaters, and arabellas,

You’re right to call this out, especially with the examples you’ve all shared around pronunciation and context.

I appreciate you checking in on this as well. While I don’t have a specific update to share right now, this is still something we’re actively tracking.

At the moment, AI text-to-speech infers pronunciation from context, and it doesn’t offer the fine-grained control over output that traditional TTS does. That makes acronyms, industry-specific terms, and words with multiple pronunciations challenging, especially where context is limited, as in headings or short phrases.

The examples you’ve all provided, from healthcare terminology to words like “record” and “lead,” are really helpful in highlighting where more control is needed. I can also see how features like phoneme support or a pronunciation library would make a big difference in reducing rework and improving consistency.
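For anyone following along, here is a rough sketch of what phoneme support combined with a pronunciation library could look like. It uses the standard SSML `<phoneme>` element with IPA. This is not a description of how AI voices work today, and whether a given engine accepts these tags is an assumption. The `PRONUNCIATIONS` table and the `to_ssml` helper are hypothetical names made up for illustration.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical pronunciation library: lowercase word -> IPA transcription.
# Each entry pins ONE reading of the word for the whole project.
PRONUNCIATIONS = {
    "lead": "lɛd",        # the metal, not the verb "to lead"
    "record": "ˈrɛkərd",  # the noun, not the verb "to record"
}


def to_ssml(text, library=PRONUNCIATIONS):
    """Wrap every word found in the library in an SSML <phoneme> tag."""
    parts = []
    # Splitting on a captured group keeps both the words and the text between them.
    for token in re.split(r"(\w+)", text):
        ipa = library.get(token.lower())
        if ipa is None:
            # Not in the library: pass through, escaping XML special characters.
            parts.append(escape(token))
        else:
            parts.append(
                f"<phoneme alphabet=\"ipa\" ph={quoteattr(ipa)}>"
                f"{escape(token)}</phoneme>"
            )
    return "<speak>" + "".join(parts) + "</speak>"


if __name__ == "__main__":
    print(to_ssml("Lead pipes & record"))
```

A flat word lookup like this cannot tell the noun "record" from the verb, so it only helps when a project consistently needs one reading. Choosing between readings would still need either context or a per-occurrence override.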

I’ve added your feedback to the existing request, including these newer use cases. We’re continuing to gather input as the team explores ways to improve pronunciation control in AI voices.

