Thanks for reaching out and letting us know what you would like to use in the text to speech tool. This is not available at this time, but we are tracking a feature request for some additional functionality, such as speed control and emphasis.
I've added this conversation to the report as we track user impact and so that we can share any updates with you here.
I recently created an application that might address the need of adjusting the speed for Text to Speech in Storyline and the ability to add emphasis to words by using SSML tags. It’s a bit of a workaround though as you’ll have to go to the source that Storyline uses, Amazon Polly voices.
Here’s what you do:
1. Get an Amazon Polly account (yes, there is some cost involved but doesn’t seem that prohibitive) (https://aws.amazon.com/polly/)
2. Save your scripts as separate files (MS-Word or Text)
3. Download HeroVoice TTS from the Microsoft Windows Store (fully functioning 15-day free trial)
4. Encode your files with HeroVoice TTS – apply a global setting for speed and even comma duration so your files are consistent.
5. Select the voice you want – these are the same as you’ll find in Storyline today including Neural voices (which aren’t currently available in Storyline)
6. Load each audio file into Storyline
The "Supported SSML tags" page in the Amazon Polly documentation is a good resource, as it lists which tags are supported for each voice type (Standard or Neural). The Neural voices are higher quality, but they *don't* support the SSML emphasis tag.
Co-signing this request, as it would be very nice to add emphasis by placing * * around a word or something like that. I imagine emphasis could just be a slight tweak to the pitch with a short delay.
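A minimal sketch of the asterisk idea above, written as a Python pre-processing step that runs outside Storyline. It rewrites `*word*` markers into SSML `<emphasis>` tags. The function name `asterisks_to_ssml` and the marker convention are hypothetical, and Amazon Polly accepts `<emphasis>` only for standard voices, so this would not help with neural voices.

```python
import re
from xml.sax.saxutils import escape


def asterisks_to_ssml(text: str, level: str = "strong") -> str:
    """Wrap *marked* words in SSML <emphasis> tags (hypothetical helper)."""
    if level not in {"strong", "moderate", "reduced"}:
        raise ValueError(f"unsupported emphasis level: {level}")
    # Escape &, < and > first so the script text cannot break the XML.
    safe = escape(text)
    # Replace each *...* span with an emphasis element.
    body = re.sub(r"\*([^*]+)\*", rf'<emphasis level="{level}">\1</emphasis>', safe)
    return f"<speak>{body}</speak>"


print(asterisks_to_ssml("You *must* save & exit."))
```

The resulting string could be pasted into any tool that accepts SSML for a standard Polly voice.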
I've used Adobe Premiere Pro for most of my videos, but create the audio from text within Storyline and then export it. It'd be nice to adjust speed and pitch and also do both independently from one another.
+1 for this feature request. The ability to change the emphasis would be especially useful for words where the syllable stress/emphasis can actually alter the meaning of the word. E.g. "record" (noun) vs "record" (verb).
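For the "record" case, Amazon Polly's SSML has a `<w>` tag whose `role` attribute hints at the part of speech, which in turn selects the pronunciation. The sketch below only builds the markup. The role values `amazon:VB` (verb) and `amazon:NN` (noun) are taken from my reading of the Polly documentation, the helper name `with_role` is hypothetical, and whether a given voice or Storyline itself honours the tag is an assumption to verify.

```python
from xml.sax.saxutils import escape

# Subset of Polly part-of-speech roles for the <w> tag (assumed from the Polly docs).
ROLES = {"verb": "amazon:VB", "noun": "amazon:NN"}


def with_role(word: str, part_of_speech: str) -> str:
    """Tag a word so the engine can pick the matching pronunciation."""
    role = ROLES[part_of_speech]
    return f'<w role="{role}">{escape(word)}</w>'


ssml = (
    "<speak>Please "
    + with_role("record", "verb")
    + " the session, then file the "
    + with_role("record", "noun")
    + ".</speak>"
)
print(ssml)
```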
Also, as an Australian user, the American voices are unfortunately much more natural sounding than the Australian voices currently available in Storyline via Amazon Polly. Please incorporate the latest Neural Polly voices into Storyline for us non-American English-speaking users - and for our clientele, who find the American and UK voices annoying and sometimes difficult to understand.
My clients consistently ask for adjustments to speed control and emphasis. I'm hoping Articulate makes these upgrades to text-to-speech in the very near future.
We are also looking into alternatives for making TTS from Storyline into more natural sounding audio. We have found a very nice-sounding alternative, but I'm hoping Articulate will enhance the existing TTS rather than having to use another authoring tool.
Hi Sheri, what are you using as your nice-sounding alternative, please? Our Creative Editor is currently swamped with projects so I'm looking to see how I can do with Articulate for now!
I've recently secured a subscription to Microsoft Azure's text to speech tool. After testing a variety of text-to-speech tools with Australian accents, this is the one we settled on. It has a huge variety of languages and accents, in masculine and feminine voices, and the pronunciation is impeccable. Even better than Google and Amazon's neural voices. I tested its pronunciation of some lengthy medications (currently working on some health-related eLearning videos) and a pharmacist on our team confirmed the TTS got it right, the first time. And the pricing... well let's just say I've been using it pretty heavily for the last month, and it's only costing us $1 so far!
You can use IPA and SSML to correct pronunciation, and can even upload a custom lexicon if there are particular words used throughout your transcripts that require correction, e.g. names of local towns and, in our case, Aboriginal nations. Emphasis can kind of be created by using the web-based TTS tool to increase the volume and the speed/rate at which particular words are spoken. However, for some reason volume changes can only be applied to entire sentences. Here is a screenshot of how I've used the TTS tool to create emphasis on the words "will not". I have then done some basic audio trimming and stitched the sentence together in Storyline.
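The "will not" approach can also be written directly as SSML rather than through the web tool. This is a rough sketch under assumptions: the helper `stress_phrase`, the sample sentence, the `-20%` rate and the voice name `en-AU-NatashaNeural` are illustrative choices, not taken from the post. It changes only the speaking rate of one phrase, since the post notes that volume could only be changed per sentence.

```python
import xml.etree.ElementTree as ET


def stress_phrase(sentence: str, phrase: str, rate: str = "-20%") -> str:
    """Slow one phrase with <prosody> inside an Azure-style SSML document."""
    before, found, after = sentence.partition(phrase)
    if not found:
        raise ValueError(f"{phrase!r} not found in sentence")
    speak = ET.Element("speak", {
        "version": "1.0",
        "xmlns": "http://www.w3.org/2001/10/synthesis",
        "xml:lang": "en-AU",
    })
    # Voice name is an assumption; substitute the voice you actually use.
    voice = ET.SubElement(speak, "voice", {"name": "en-AU-NatashaNeural"})
    voice.text = before
    prosody = ET.SubElement(voice, "prosody", {"rate": rate})
    prosody.text = phrase
    prosody.tail = after  # text that follows the stressed phrase
    return ET.tostring(speak, encoding="unicode")


print(stress_phrase("You will not share your password.", "will not"))
```

Building the document with `ElementTree` rather than string concatenation keeps the markup well-formed even if the script contains characters such as `&`.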
Our eLearning participants really hate listening to American or British narrators, and they hate the fake, robotic sounding Australian Amazon Polly voice that's built into Storyline. Some have said they would rather mute the entire video and read the captions instead. Us non-American eLearning developers really need an integrated, high quality neural voice within Storyline. But until Articulate make this a priority, the Microsoft Azure voices are a great alternative.
I just wanted to share some news about Storyline 360 Update 80. This update might be interesting for you since you’ve explored other options to improve the quality of TTS. In update 80, we have taken advantage of Amazon Polly’s neural text to speech feature. You will see better versions of most of the voices when inserting TTS audio. These will show up in the same place as the standard voices, under “Neural Voices”. These are voices that sound more natural and human-like and are considerably higher in quality compared to the older standard TTS voices.
A list of these voices can be found here. Updating Storyline 360 to the latest version is super easy, here is the guide in case anyone needs it.
We will continue to keep tabs on requests to support ways to control emphasis, speaking rate, inserting silence, pronunciation and SSML support in general. Please let me know of any feedback (good or bad), around this enhancement and we’ll be happy to pass it along to our dedicated team of engineers.
All the best,
Michael Marcos
Customer Support Product Liaison
Thank you Michael. Appreciate your efforts to integrate at least one neural TTS product in Storyline. I have tested the one Australian voice now available in Storyline, and while it is significantly better than the robotic sounding standard voices, it doesn't quite meet our business needs. Azure TTS voices are much more natural and allow pronunciation to be corrected using the IPA, not just SSML. Here is a demo video I created, showing the use case for our particular business needs. I will of course recommend the built-in neural voice for users within our business that have very short production timelines, but for everyone else, I expect they will prefer the Azure voices.
Thank you Brooke for the video, something to think about for sure!
And hello everyone, just wanted to share with y'all that Update 83 now allows for SSML support in TTS.
Unlock new possibilities for text-to-speech audio. Use speech synthesis markup language (SSML) to adjust the speaking rate, modify pronunciation, emphasize words, add pauses, and more.
To take advantage of this update, launch the Articulate 360 desktop app on your computer, and click the Update button next to Storyline 360. You'll find our step-by-step instructions here!
Hi Christopher,
I wanted to share some information about how we manage these feature requests as that may be helpful.
I also would like to see some additional functionality with the text to speech. Especially word emphasis.
I agree as well!
I agree. A way to emphasize a word would really help.
Add me to the list of people interested in word emphasis with TTS. Speed control sounds like a great feature too.
Hear hear to the above!
@Sheri -
Just curious, what alternative did you find?
Hi Mike, it seems like Storyline does not support the <emphasis> tag, is that correct?
The <emphasis> tag is only supported for standard voices, not for neural voices:
https://docs.aws.amazon.com/polly/latest/dg/supportedtags.html#emphasis-tag
Storyline does not provide standard voices, so the tag is of no use there. I've slowed voices to 85% for a few words to try to provide emphasis. It helps a little.
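The 85% workaround can be expressed with the SSML `<prosody>` tag, which Polly does accept for neural voices. The sketch below picks the markup by engine type: `<emphasis>` for standard voices and a slowed `<prosody>` span for neural ones. The function name `emphasize` and the `engine` labels are hypothetical. The 20-200% bounds follow the range the Polly documentation gives for percentage rates, as far as I know.

```python
from xml.sax.saxutils import escape


def emphasize(words: str, engine: str, rate_percent: int = 85) -> str:
    """Return SSML that stresses `words` for the given voice engine."""
    safe = escape(words)
    if engine == "standard":
        # Standard voices support the emphasis tag directly.
        return f'<emphasis level="strong">{safe}</emphasis>'
    if engine == "neural":
        # Neural voices: fall back to slowing the words down instead.
        if not 20 <= rate_percent <= 200:
            raise ValueError("rate_percent must be between 20 and 200")
        return f'<prosody rate="{rate_percent}%">{safe}</prosody>'
    raise ValueError(f"unknown engine: {engine}")


print(f"<speak>Do {emphasize('not', 'neural')} skip this step.</speak>")
```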