Text to Speech SSML tag

Mar 11, 2019

Hi,

I would like to know whether it is possible to tune the Articulate TTS voices using the markup tags defined by the Speech Synthesis Markup Language (SSML) Version 1.1 W3C Recommendation?

Or any other tags?

pascal

Pinned Reply
Kelly Auner

Hi, everyone!

I have some great news to share. We just released another update for Storyline 360. In Update 83, we’ve included important fixes and new features!

One enhanced feature we’ve included:

Unlock new possibilities for text-to-speech audio. Use speech synthesis markup language (SSML) to adjust the speaking rate, modify pronunciation, emphasize words, add pauses, and more.

To take advantage of this update, launch the Articulate 360 desktop app on your computer, and click the Update button next to Storyline 360. You'll find our step-by-step instructions here!

34 Replies
John Morris

Yes, the voices in Storyline TTS do sound very much like the Polly voices.

I have used both the TTS feature in Storyline and Polly with SSML from the Amazon Polly website.  When using Polly, the only option I have found is to save the Polly speech as a file external to Storyline and import it using Insert Audio.

Storyline TTS is so darn convenient, being able to integrate with notes and to make almost instant updates.  For me it is not worth the hassle of using files from Polly.

If a client tells me "I don't like the way the voice says that word," I am kind of concerned, but not overly so, because the goal of the training can still be met. If a client tells me "I don't understand that sentence at all," I am much more concerned, because the training is not working. When this happens, I explore other options for making the speech clearer, starting with rewording or phonetic spelling in Storyline and moving to another speech generator only if I absolutely must.

I have had demonstrations with SMEs and stakeholders to show how fast and accurately I can work and make updates using Storyline TTS.  I point out that having a strange sounding word every now and then is a small price to pay against those benefits.

If there is a way to use SSML in Storyline, I would love to learn about it.

Thor Melicher

I see lots of different requests here, and I might have a solution for you, but it requires going to the source that Storyline uses: Amazon Polly. To make things a bit simpler, I’ve created an application that addresses many of the things mentioned here:

  • Adjust the overall speed of your files with one setting
  • Adjust the overall pause duration for commas
  • Add your own SSML to get more nuanced, natural-sounding results and, as necessary, correct the pronunciation of words
  • Neural voices
  • Batch process your files

Here’s what you do:

  1. Get an Amazon Polly account (yes, there is some cost involved but doesn’t seem that prohibitive) (https://aws.amazon.com/polly/)
  2. Save your scripts as separate files (MS-Word or Text)
  3. Download HeroVoice TTS from the Microsoft Windows Store (fully functioning 15-day free trial)
  4. Encode your files with HeroVoice TTS – apply a global setting for speed and even comma duration so your files are consistent.
  5. Select the voice you want – these are the same as you’ll find in Storyline today including Neural voices (which aren’t currently available in Storyline)
  6. Load each audio file into Storyline
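
As a rough illustration of what a global speed setting and a fixed comma pause could look like in SSML, here is a minimal Python sketch. It is not how HeroVoice TTS works internally (the thread doesn't say); the function name `build_ssml` and its default values are made up for this example. It only builds the SSML string, which you would then paste into or send to whatever tool does the synthesis.

```python
from xml.sax.saxutils import escape


def build_ssml(script: str, rate: str = "95%", comma_pause_ms: int = 300) -> str:
    """Wrap plain text in <speak>, apply one speaking rate, and pause after commas.

    Hypothetical helper for illustration only; the defaults are arbitrary.
    """
    # Escape &, < and > so the script text cannot break the markup.
    safe = escape(script)
    # Insert a fixed-length break after every comma.
    with_pauses = safe.replace(",", f',<break time="{comma_pause_ms}ms"/>')
    # One prosody element around the whole script keeps the speed consistent.
    return f'<speak><prosody rate="{rate}">{with_pauses}</prosody></speak>'


if __name__ == "__main__":
    print(build_ssml("Welcome back, everyone.", rate="90%", comma_pause_ms=250))
```

Running the same function over every script file is one way to keep speed and pause length consistent across a batch.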

Drago Ivanov

I noticed that this discussion is from 2 years ago, yet there is still no SSML capability in SL360. I also use Lectora, and it has had an SSML option since the feature was introduced (it looks like Amazon Polly as well).

For the time being, I will generate the voices in Lectora and import them into SL360; at least I have this option.

Karen Loftus

Kelly, I'm trying to use the SSML option using today's update (13DEC23).

Once I paste in the text and choose a voice (Danielle), how do I get the SSML characters to be applied? I added a few manually, but that doesn't seem to work.

Trying it a different way, it seems like I might need to start with <speak> and end with </speak>. When I did that, the other SSML characters looked like they took, but the audio just "said" those things.

What am I missing?

These voices are all from Amazon Polly, right?

Annie Kim

Hi Karen, I'm not sure how you got into that state, but I was able to get it working by doing the following:

  1. Deleted the opening and closing speak tags, saved, and then re-added them. When you try to save you'll get an error message.
  2. Deleted the space so "amazon: effect" is now "amazon:effect"
  3. Added the closing tag "</amazon:effect>" for the whisper effect at the end of the sentence
  4. Changed the voice to a standard voice in another language because the whisper effect doesn't work with neural voices

I attached the modified project. Hope this helps!
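
The first three fixes above are all well-formedness problems (missing `<speak>` wrapper, a space inside the tag name, an unclosed tag), so they can be caught before pasting the text into Storyline. Below is a minimal sketch using only Python's standard library. The function name `check_ssml` and the placeholder namespace URI are assumptions for this example, and it does not reproduce Storyline's own validation, which may accept or reject other things.

```python
import xml.etree.ElementTree as ET

# Placeholder URI so the XML parser accepts the "amazon:" prefix.
# This value is an assumption for the sketch, not an official namespace.
AMAZON_NS = "urn:example:amazon-ssml"


def check_ssml(ssml: str) -> list[str]:
    """Return a list of problems found; an empty list means the markup parsed."""
    text = ssml.strip()
    if not text.startswith("<speak") or not text.endswith("</speak>"):
        return ["text must start with <speak> and end with </speak>"]
    # Declare the amazon prefix on the root element before parsing.
    declared = text.replace("<speak", f'<speak xmlns:amazon="{AMAZON_NS}"', 1)
    try:
        ET.fromstring(declared)
    except ET.ParseError as exc:
        # Covers "amazon: effect" (space in the tag name) and unclosed tags.
        return [f"not well-formed: {exc}"]
    return []


if __name__ == "__main__":
    sample = '<speak><amazon: effect name="whispered">quiet please</speak>'
    for problem in check_ssml(sample):
        print(problem)
```

This only checks structure. Whether a given tag is supported by the selected voice (step 4) is a separate question.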

Comprehend eLearning

Hi, I've updated and am testing it out. I wanted to change the pitch of a voice.

<speak>
<prosody pitch="x-low">This text has extra low pitch</prosody>
</speak>

When I try the above and click Insert, I get an error message saying I need to verify that the SSML tags are correct and supported for the selected voice. I think the tags are correct for pitch. When I look at the Articulate SSML page, pitch appears to work only for Standard voices. The dropdown in my Insert Text-to-Speech appears to show only Neural voices. How do I switch to Standard voices?

Also, just to confirm: will you only hear the changes to the voice when you publish, and not in Preview?

Thanks!

Jose Tansengco

Hello Comprehend eLearning,

Happy to help!

You can find a list of the Standard Voices here. You'll want to make sure that you are using a voice from this list when using the "<prosody>" tag. You can also find more information on how to add values to the tag here for your reference. 

And to address your follow-up question, you will hear the changes to the voice during preview as well, so no need to publish to hear the differences.

Jose Tansengco

Hello Comprehend eLearning,

Yes, that is correct. Currently, the "<prosody>" tag can only be used on the standard voices listed in the article that I shared. Unfortunately, English (USA) only exists as a neural voice but you're welcome to raise a feature request here for additional standard voices.
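
For anyone who wants to check a script before choosing a voice, here is a small Python sketch that flags the two cases this thread reports as failing with neural voices: the whispered effect and prosody pitch. The function name `neural_conflicts` is hypothetical, and the list covers only what is mentioned in this thread, not the full set of voice restrictions, so treat the linked article as the authority.

```python
import re

# Only the two cases reported in this thread; not a complete compatibility list.
NEURAL_UNSUPPORTED = {
    "whispered effect": re.compile(r'<amazon:effect\s+name="whispered"'),
    "prosody pitch": re.compile(r"<prosody\b[^>]*\bpitch="),
}


def neural_conflicts(ssml: str) -> list[str]:
    """Return labels of tags in the text that the thread says need a standard voice."""
    return [label for label, pattern in NEURAL_UNSUPPORTED.items() if pattern.search(ssml)]


if __name__ == "__main__":
    sample = '<speak><prosody pitch="x-low">This text has extra low pitch</prosody></speak>'
    print(neural_conflicts(sample))
```

If the list comes back non-empty, switch to a standard voice or remove the flagged tag before clicking Insert.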

Richard Fouchaux

I was very excited to try it, but very disappointed that it simply doesn't work here. The voices read the SSML code, which has no effect on the output. The syntax highlighting is there, so that part of the update took place. Based on your stellar record I install AS360 updates almost immediately without fear, but the neural voices have never impressed me — I sense now that they possibly aren't properly updated. I didn't know about a workaround until today, but I'm creating new audio and changing old with more than just a space at the end, and I've had to use/avoid the same voices as ever.

To summarize: SSML is not working after installing the latest update. Despite syntax highlighting, both Standard and Neural voices read the tags out loud. In light of information gleaned here today, I suspect my Neural voices have not updated properly, perhaps across this and several previous updates.

John Morris

I am having the same issue as Richard F above.  The voice reads the tags.

When I download the file entitled "ssml_support.story" and run it, the voice reads the tags aloud.

When I download the file entitled "modified_ssml_suppoort.story", the tags appear to be working: they affect the voice performance instead of being read aloud.

Why does one work and not the other?  What is the difference from one to the other? I have read thru the thread but I am not sure I understand.

Is there a reference somewhere that we can go and learn about this stuff?