Arabic language - diacritics - Scripts and texts in slides

Jun 05, 2023

Dear all,

We are translating courses for one of our major customers from English into Modern Standard Arabic. Their courses contain on-slide text and text-to-speech scripts.

Arabic translators never use diacritics when translating, since readers interpret the text and pronounce it correctly according to the context, which the robot in Articulate doesn't seem to do.

While doing the target formatting, our DTP specialist told us that the robot in Articulate does not pronounce the words correctly when there are no diacritics (something we did not know).

The only solution is to add diacritics, when required, whether the texts are in the slides or in the scripts.

Are you aware of this problem and is there a way around it?

We consulted several forums and found no discussion on this subject.

 

Thank you for your help.

Bertrand Bonaventure

 

2 Replies
Jose Tansengco

Hi Bertrand,

Thanks for reaching out!

I haven't seen this issue reported by other users so we'll need to take a closer look at the behavior to figure out what's happening. I opened a support case on your behalf to get you in touch with our support team, and they'll be reaching out by email momentarily to help you troubleshoot.

Jürgen Schoenemeyer

>which the robot in Articulate doesn't seem to do.

Change "Articulate" to "Amazon": it's a problem with Amazon Polly (used by Storyline).

Short vowels (diacritics) are not part of the Arabic alphabet. As a result, one written form might be pronounced in several different ways, with every option carrying its own meaning and representing a different part of speech. Vocalization can’t be performed in isolation because correct pronunciation depends heavily on the linguistic context of each word. In a real-life situation, Arabic readers add diacritics during reading to disambiguate words and to pronounce them correctly. In the TTS voice development process, Arabic requires a diacritizer that predicts the diacritics. The Amazon Arabic TTS voice handles unvocalized Arabic content thanks to the built-in diacritizer. If a customer provides vocalized input, Zeina generates the corresponding audio as well.

The built-in diacritizer does not seem to be perfect.
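
If diacritics have to be added by hand only where the voice mispronounces a word, it can help to see which words in a script are still unvocalized. The following is a minimal Python sketch for that. It is not a Storyline or Amazon Polly feature. It assumes the marks of interest are the Unicode characters U+064B to U+0652 (tanwin, fatha, damma, kasra, shadda, sukun), and the function names are hypothetical.

```python
# Short-vowel and related marks, assumed here to be U+064B..U+0652.
HARAKAT = {chr(cp) for cp in range(0x064B, 0x0653)}


def is_arabic_letter(ch: str) -> bool:
    """True for the basic Arabic letters U+0621..U+064A."""
    return "\u0621" <= ch <= "\u064A"


def strip_diacritics(text: str) -> str:
    """Return the text with all marks in HARAKAT removed."""
    return "".join(ch for ch in text if ch not in HARAKAT)


def unvocalized_words(text: str) -> list[str]:
    """Return the whitespace-separated Arabic words that carry no marks."""
    return [
        word
        for word in text.split()
        if any(is_arabic_letter(ch) for ch in word)
        and not any(ch in HARAKAT for ch in word)
    ]


# The same three letters (kaf, ta, ba) read as "kataba" (he wrote)
# or "kutub" (books), depending on the short vowels.
bare = "\u0643\u062A\u0628"                    # no diacritics
kataba = "\u0643\u064E\u062A\u064E\u0628\u064E"  # with fatha marks
kutub = "\u0643\u064F\u062A\u064F\u0628"         # with damma marks

# Both vocalized forms collapse to the same bare spelling.
print(strip_diacritics(kataba) == bare)
print(strip_diacritics(kutub) == bare)

# Only the bare word is reported as a candidate for review.
print(unvocalized_words(f"{bare} {kutub}"))
```

A reviewer could run the script text through `unvocalized_words`, listen to the generated audio, and add diacritics only to the flagged words that are actually mispronounced.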