Forum Discussion

CindeeCalton
Community Member
5 months ago

Emphasis and neural voices

Hi! I'm new to using text-to-speech.

I see in this article that you cannot use the <emphasis> tag on a neural voice: SSML support

I see here that there are no standard voices in American English: Selecting languages and voices.

So does that mean there is no way to edit the emphasis in a sentence for an American English voice? Is there a workaround? I like the Danielle voice for the most part but I've got a few sentences she is doing a pretty bad job with.

Thanks!

  • Great question. Amazon's article about SSML tagging states: "Emphasizing words changes the speaking rate and volume."

    So you could still build your own emphasis by adjusting both the rate and the volume. It takes more steps, but it's a viable workaround.

  • CindeeCalton
    Community Member

    Thanks for your reply, Ron. I'm trying to figure out how to put both volume and speed tags on one word. I tried nesting one in the other and got an error. Can I only do one?

    Thanks again!


    • JoseTansengco
      Staff

      Hi C.J., 

      Happy to troubleshoot your SSML code for you. Would you mind sharing the markup you're using to adjust the volume and speed so we can check whether the tags are structured correctly? The <prosody> tag should work for adjusting both volume and speed. Here's an example of correct usage:

      <speak>
           <prosody volume="loud" rate="x-slow">Hello</prosody>
      </speak>  

      You'll see in the example that a single <prosody> tag can carry more than one attribute; the syntax just has to be correct.
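If you end up emphasizing many words this way, generating the tag from a small script can avoid the hand-nesting mistakes mentioned earlier in the thread. The following is only a sketch: the helper names `emphasize` and `speak` are made up for illustration and are not part of Storyline 360 or any SSML library, and the output still has to be pasted into the text-to-speech field by hand.

```python
from xml.sax.saxutils import escape, quoteattr


def emphasize(word: str, volume: str = "loud", rate: str = "slow") -> str:
    """Wrap `word` in ONE <prosody> element that carries both attributes.

    Hypothetical helper for illustration only.
    """
    # quoteattr() adds the surrounding quotes and escapes the attribute value;
    # escape() protects &, < and > in the spoken text.
    return (
        f"<prosody volume={quoteattr(volume)} rate={quoteattr(rate)}>"
        f"{escape(word)}</prosody>"
    )


def speak(*parts: str) -> str:
    """Join already-escaped fragments into a single <speak> document."""
    return "<speak>" + " ".join(parts) + "</speak>"


ssml = speak(
    "This module focuses on",
    emphasize("your", volume="x-loud", rate="x-slow"),
    "role.",
)
print(ssml)
```

Both attributes sit on the same opening tag, so there is only one closing tag to get right.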

  • CindeeCalton
    Community Member

    Thank you so much! I put that in and got no errors, so that's a start. But I don't hear ANY difference. Am I doing something wrong? I'm using Danielle, if that helps.

    <speak>
    This module focuses on <prosody volume="x-loud" rate="x-slow">your</prosody> role. Carrying out the plan.
    </speak>

  • You can hear it better with a break before and after the emphasis.

    <speak>
    This module focuses on <break time="30ms"/>
    <prosody volume="x-loud" rate="x-slow">your</prosody>
    <break time="20ms"/>
    role. Carrying out the plan.
    </speak>

    Use the example below to set the volume and rate with explicit values instead of the named levels.

    <speak>
    This module focuses on <break time="30ms"/> <prosody volume="+45dB" rate="70%">your</prosody><break time="20ms"/> role. Carrying out the plan.
    </speak>
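To get a feel for the decibel values, it may help to convert them to amplitude multipliers. The sketch below uses the standard decibel formula for amplitude, ratio = 10^(dB/20); the function name `db_to_amplitude_ratio` is made up for illustration. How Storyline's text-to-speech service treats very large values (for example, whether it caps them) is not stated in this thread, so take the numbers as rough guidance only.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert a relative level in decibels to an amplitude multiplier."""
    return 10 ** (db / 20)


# Compare a modest boost with the value used in the example above.
for db in (6, 45):
    ratio = db_to_amplitude_ratio(db)
    print(f"+{db}dB is about {ratio:.1f}x the amplitude")
```

By this formula +6dB is roughly double the amplitude, while +45dB is roughly 178 times, so a much smaller value may be a better starting point.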


  • CindeeCalton
    Community Member

    Thanks to all who tried to help. I tried adding pauses as Suzanne suggested. I could hear the changes better, but no matter how short I made the pause, it sounded very weird. I think it needs a pitch change to work, so I'm just going to change the script to something that doesn't sound weird. Maybe Articulate will add the ability to edit pitch in neural voices soon.

    Side note: is the only way to test your markup to save it, open the audio, and go back and forth?

    • JoseTansengco
      Staff

      Hi C.J.,

      Yes, that is correct. Storyline 360's text-to-speech service has to generate the audio file before you can preview it, so there isn't a way to test the markup without saving first.
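Hearing the result does require saving, but a quick local check can at least catch malformed tags before a save-and-listen round trip. The sketch below uses Python's standard XML parser; `ssml_problem` is a hypothetical helper name. It only verifies that the markup is well-formed XML with a <speak> root. It cannot tell whether a given voice supports a tag, or how the result will sound.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def ssml_problem(ssml: str) -> Optional[str]:
    """Return None if the markup is well-formed and rooted at <speak>,
    otherwise a short description of the problem.

    Hypothetical helper for illustration only.
    """
    try:
        root = ET.fromstring(ssml)
    except ET.ParseError as err:
        return f"not well-formed: {err}"
    if root.tag != "speak":
        return f"root element is <{root.tag}>, expected <speak>"
    return None


good = (
    '<speak>This module focuses on '
    '<prosody volume="x-loud" rate="x-slow">your</prosody> role.</speak>'
)
# Closing tags in the wrong order, an easy mistake when nesting by hand.
bad = (
    '<speak><prosody volume="loud"><prosody rate="slow">your'
    '</prosody></speak></prosody>'
)

print(ssml_problem(good))  # None means no structural problem was found
print(ssml_problem(bad))   # a parser message describing the mismatch
```

A structural check like this would flag the kind of nesting error mentioned earlier in the thread, but the audio still has to be generated to judge the emphasis itself.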