Forum Discussion

MathewRoberts-2
Community Member
12 months ago

Using SSML for Text-to-Speech

I am trying to use SSML to craft a dynamic voiceover for one of my Storyline 360 lessons. I am having some issues with the program accepting the script I put in. I also don't see where I can use the standard voices since all my text-to-speech options for English-US seem to be Neural. Here is the script I would like to use, and I appreciate any assistance. 

<speak>
Hello! <break time="500ms"/> This is an example of an SSML-enhanced text.

<p>Paragraphs are separated by a natural pause in speech.</p>

For phonetic pronunciation, use <phoneme alphabet="ipa" ph="tɛkst">text</phoneme>.

This sentence is followed by a natural pause in speech.

This is an **important** point. <prosody volume="+3dB">Highlight this part.</prosody>

This is the 1st <say-as interpret-as="ordinal">first</say-as> example of using <say-as>.

The <sub alias="World Wide Web Consortium">W3C</sub> is how you pronounce the abbreviation.

<mark name="example_tag"/>

This section has dynamic range compression for easier listening. <amazon:effect name="drc"/>

This update is coming to you live, in the style of a newscaster. <amazon:domain name="news"/>

This is a secret. <prosody volume="-5dB">Shh! It's a secret.</prosody>
</speak>

  • More examples like this in your documentation of supported tags would be really helpful.

    From what I read on your pages, I couldn't really tell which tags are open tags and which are closed. I also didn't realize, at first, that many tags need attributes. To be clear, I am well versed in HTML, but I didn't pick up on these things because so little detail was provided. Going to Amazon's documentation helped with proper/complete tags, but then it's possible that those tags are not supported. All of this resulted in many error messages, because the tags were just not formed correctly, even though they were in the list.

    And, since there was some messaging about not all tags being supported for all voices (standard vs neural, etc), it left me feeling a bit... underwhelmed, and dubious of the implementation, even though I was very much excited when I first saw that it was now available.

    So, I am happy to see this example, and while this post probably seems negative, I really am just wanting to suggest that more examples get put into the documentation pages, so it's easier to try things out and get good results while playing with the various tags/attributes/options :)
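
    To show the kind of example I mean, here is a rough sketch pieced together only from tags that appear in this thread (not from the official documentation, so treat it as an assumption about what Storyline accepts). <break> stands alone and ends with "/>", while <prosody> and <sub> wrap the text they affect and each need an attribute to do anything:

    ```xml
    <speak>
    One moment. <break time="500ms"/> Thanks for waiting.
    <prosody volume="+3dB">This part is a little louder.</prosody>
    The <sub alias="World Wide Web Consortium">W3C</sub> publishes the SSML specification.
    </speak>
    ```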

     

    • AnnetteWeaver
      Community Member

      I have to agree that the SSML Support page would benefit from clarifying exactly which SSML tags work in SL. It is now six months after this thread started, and the SSML Support page hasn't been updated since May 9, 2024.

      Maybe create separate tables for which tags work with Standard Voices only, and which work with Neural Voices? A co-worker and I spent over an hour trying to troubleshoot markup for a Neural voice, only to belatedly realize that the particular attribute we were trying to apply only works with Standard Voices. <sigh>

      • LucianaPiazza
        Staff

        Hi Annette, 

        Hope that you're having a great start to your week! Our team appreciates your feedback. I wanted to share an update with you!

        We've updated this article to provide more clarity when checking whether an SSML tag is supported for a particular voice and share additional information on all supported tags.

        Let us know your thoughts! I hope this is helpful! 

  • Hello Mathew,

    Happy to help!

    Some of the tags in your script, such as the newscaster tag <amazon:domain name="news"/>, can only be used with Neural voices. I also saw some tags that weren't in the correct format. After removing the unsupported tags and correcting the malformed ones, I found that this version of your script works with standard voices:

    <speak>
    Hello! <break time="500ms"/> This is an example of an SSML-enhanced text.
    <p>Paragraphs are separated by a natural pause in speech.</p>
    For phonetic pronunciation, use <phoneme alphabet="ipa" ph="tɛkst">text</phoneme>.
    This sentence is followed by a natural pause in speech.
    This is an **important** point. <prosody volume="+3dB">Highlight this part.</prosody>
    This is the 1st <say-as interpret-as="ordinal">first</say-as> example of using say-as.
    The <sub alias="World Wide Web Consortium">W3C</sub> is how you pronounce the abbreviation.
    <mark name="example_tag"/>
    <amazon:effect name="drc">This section has dynamic range compression for easier listening. </amazon:effect>
    This is a secret. <prosody volume="-5dB">Shh! It's a secret.</prosody>
    </speak>

    Note that the script above has the newscaster tag removed, since that tag isn't compatible with standard voices.
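
    As one example of a format correction, here is the dynamic range compression line from your original script next to the version used above. This is only a side-by-side of the two scripts in this thread. In the original, the tag was self-closed after the sentence, so it wrapped no text:

    ```xml
    This section has dynamic range compression for easier listening. <amazon:effect name="drc"/>
    ```

    In the corrected version, the tag opens before the sentence and closes after it, so the effect has text to apply to:

    ```xml
    <amazon:effect name="drc">This section has dynamic range compression for easier listening. </amazon:effect>
    ```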

    You can check out this article for a full list of standard and neural voices, and here's a list of supported SSML tags as well as how to properly use them for your reference.

    Hope this helps!

     

  • Hi. I completely agree that some more examples would be really useful.
    I'm struggling quite a bit, as I'm not well versed in HTML.

    I am trying to put together some basic literacy lessons and want letter sounds pronounced phonetically. I can't get the tag right. I tried copying the above example. Storyline accepts the markup, but just leaves silence where the ssss sound is supposed to be. Where have I gone wrong?
    <phoneme alphabet="ipa" ph="s">sss</phoneme>

    • StevenBenassi
      Staff

      Hi Jessica!

      Sorry to hear you've run into trouble while working with SSML tags in Storyline!

      Testing the tag you shared on my end (64-bit Storyline 360 version 3.85.31840.0 via Windows Parallels on a Mac M1) I did observe the "S" being pronounced. Here's a quick screen recording sharing my results.

      To clarify, does your code contain the opening <speak> and closing </speak> tags needed to identify SSML-enhanced text?
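
      For reference, a minimal sketch of the tag you shared with the speak tags around it (the sentence text is only a placeholder, not taken from your file):

      ```xml
      <speak>
      Listen to this sound: <phoneme alphabet="ipa" ph="s">sss</phoneme>
      </speak>
      ```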

      If you're comfortable sharing a copy of your .story file, it would be helpful to have a closer look at what you've built out so far. Feel free to share it here in the discussion or privately through a support case.

      We'll delete it from our system as soon as troubleshooting is complete!

  • Hi Steven 

    Thanks for your reply and the example.

    I opened the file to check where the tag was used, and voilà, it was working (Murphy's law!). Perhaps I wasn't listening closely enough before. The 's' sound is there, though it seems to run straight on from the word before.

    Could you advise how best to include a pause before the 's' sound and whether it is possible to lengthen the sound? So instead of a short s, a longer ssss sound (not separate s s s).

    I've attached the file. It is a very rough draft of a literacy assessment concept. See slide 1.3.

    Thanks!

    • JoseTansengco
      Staff

      Hi Jessica,

      Happy to chime in!

      You can use the <break> tag to add a pause in your paragraph. Here's how. As for making the s longer, try this text which uses both the <break> and <prosody> tags to emphasize the letter S:

      <speak>
      Which letter makes the <break time="0.3s"/><prosody rate="x-slow">s</prosody> sound? Click on the letter.
      </speak>
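
      If you'd rather keep the phonetic spelling from your original tag, the two can in principle be combined. This is an untested sketch: the SSML specification allows <phoneme> inside <prosody>, but I haven't confirmed how a given Storyline voice renders it, and the 1s pause is just an example value to adjust by ear:

      ```xml
      <speak>
      Which letter makes the <break time="1s"/><prosody rate="x-slow"><phoneme alphabet="ipa" ph="s">s</phoneme></prosody> sound? Click on the letter.
      </speak>
      ```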

      Hope this helps!

  • JessicaFritz
    Community Member

    Hello!
    When I add SSML tags, they are read aloud by the narrator. I am using <speak> and </speak> before and after my text, and the narrator says the word "speak". What am I doing wrong?

  • Hi Jessica!

    Thanks for reaching out. Could you please share your .story file with us in this thread or privately in a support case? It would help to see your setup so we can suggest modifications to the SSML tags if needed.

    Looking forward to hearing from you!