Thanks for reaching out and letting us know what you would like to use in the text to speech tool. This is not available at this time, but we are tracking a feature request for some additional functionality, such as speed control and emphasis.
I've added this conversation to the report as we track user impact and so that we can share any updates with you here.
I recently created an application that might address the need of adjusting the speed for Text to Speech in Storyline and the ability to add emphasis to words by using SSML tags. It’s a bit of a workaround though as you’ll have to go to the source that Storyline uses, Amazon Polly voices.
Here’s what you do:
1. Get an Amazon Polly account (yes, there is some cost involved, but it doesn't seem prohibitive): https://aws.amazon.com/polly/
2. Save your scripts as separate files (MS Word or text).
3. Download HeroVoice TTS from the Microsoft Windows Store (fully functioning 15-day free trial).
4. Encode your files with HeroVoice TTS and apply a global setting for speed, and even comma duration, so your files are consistent.
5. Select the voice you want. These are the same voices you'll find in Storyline today, plus the Neural voices (which aren't currently available in Storyline).
6. Load each audio file into Storyline.
Amazon Polly's list of supported SSML tags is a good resource, as it shows which tags are supported for each voice type (Standard or Neural). The Neural voices are higher quality but *don't* support the SSML emphasis tag.
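For anyone who wants to see what that SSML looks like, here is a minimal sketch in Python. It only builds the SSML string; the function name `build_ssml`, the 90% rate, and the "strong" emphasis level are my own choices for illustration, not anything from HeroVoice TTS or Storyline. It skips the emphasis tag when `neural=True`, since the Neural voices don't support it.

```python
from xml.sax.saxutils import escape


def build_ssml(text, emphasized=(), rate="90%", neural=False):
    """Wrap text in <speak>/<prosody> and emphasize selected words.

    Hypothetical helper: emphasis is only added for Standard voices,
    because the Neural voices do not support the <emphasis> tag.
    """
    words = []
    for word in text.split():
        core = word.strip(".,!?;:")  # compare without trailing punctuation
        safe = escape(word)          # escape &, <, > for XML
        if core in emphasized and not neural:
            safe = f'<emphasis level="strong">{safe}</emphasis>'
        words.append(safe)
    body = " ".join(words)
    return f'<speak><prosody rate="{rate}">{body}</prosody></speak>'


print(build_ssml("You will not pass.", emphasized={"not"}))
```

If you call Polly yourself rather than through a tool, the resulting string is sent as SSML instead of plain text (in boto3, that is `synthesize_speech` with `TextType="ssml"`).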
Co-signing this request, as it would be very nice to add emphasis by placing * * around a word or something like that. I imagine emphasis could just be a slight tweak to the pitch with a short delay.
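Until something like that exists in Storyline, the asterisk idea could be approximated outside it by pre-processing a script into SSML. This is only a rough sketch: the `*word*` convention and the function name `stars_to_ssml` are made up for illustration, and the emphasis tag it produces only works on voices that support it.

```python
import re
from xml.sax.saxutils import escape

# Matches *text* where the text starts and ends with a non-space character,
# so a lone asterisk such as "2 * 3" is left alone.
STAR_PATTERN = re.compile(r"\*([^*\s](?:[^*]*[^*\s])?)\*")


def stars_to_ssml(script):
    """Convert *word* markers in a plain-text script to SSML emphasis tags."""
    escaped = escape(script)  # escape XML special characters first
    marked = STAR_PATTERN.sub(r'<emphasis level="moderate">\1</emphasis>', escaped)
    return f"<speak>{marked}</speak>"


print(stars_to_ssml("This is *very* important."))
```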
I've used Adobe Premiere Pro for most of my videos, but create the audio from text within Storyline and then export it. It'd be nice to adjust speed and pitch and also do both independently from one another.
+1 for this feature request. The ability to change the emphasis would be especially useful for words where the syllable stress/emphasis can actually alter the meaning of the word. E.g. "record" (noun) vs "record" (verb).
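Outside Storyline, one way this kind of stress difference is usually handled in SSML is the `phoneme` tag with an IPA transcription. Here is a small sketch of that; the helper name `phoneme` is mine, and the IPA strings are approximate transcriptions I've assumed for illustration, so check them against the voice and accent you actually use.

```python
from xml.sax.saxutils import escape, quoteattr

# Approximate IPA for the two readings of "record" (assumed, not verified
# against any particular voice).
IPA = {
    "noun": "ˈrɛkɔːd",  # stress on the first syllable
    "verb": "rɪˈkɔːd",  # stress on the second syllable
}


def phoneme(word, ipa):
    """Wrap a word in an SSML <phoneme> tag carrying its IPA pronunciation."""
    return f'<phoneme alphabet="ipa" ph={quoteattr(ipa)}>{escape(word)}</phoneme>'


ssml = (
    "<speak>Please "
    + phoneme("record", IPA["verb"])
    + " the session, then file the "
    + phoneme("record", IPA["noun"])
    + ".</speak>"
)
print(ssml)
```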
Also, as an Australian user, the American voices are unfortunately much more natural sounding than the Australian voices currently available in Storyline via Amazon Polly. Please incorporate the latest Neural Polly voices into Storyline for us non-American English-speaking users - and for our clientele, who find the American and UK voices annoying and sometimes difficult to understand.
My clients consistently ask for adjustments to speed control and emphasis. I'm hoping Articulate makes these upgrades to text-to-speech in the very near future.
We are also looking into alternatives for making TTS from Storyline into more natural sounding audio. We have found a very nice-sounding alternative, but I'm hoping Articulate will enhance the existing TTS rather than having to use another authoring tool.
Hi Sheri, what are you using as your nice-sounding alternative, please? Our Creative Editor is currently swamped with projects so I'm looking to see how I can do with Articulate for now!
I've recently secured a subscription to Microsoft Azure's text to speech tool. After testing a variety of text-to-speech tools with Australian accents, this is the one we settled on. It has a huge variety of languages and accents, in masculine and feminine voices, and the pronunciation is impeccable. Even better than Google and Amazon's neural voices. I tested its pronunciation of some lengthy medications (currently working on some health-related eLearning videos) and a pharmacist on our team confirmed the TTS got it right, the first time. And the pricing... well let's just say I've been using it pretty heavily for the last month, and it's only costing us $1 so far!
You can use IPA and SSML to correct pronunciation, and can even upload a custom lexicon if there are particular words used throughout your transcripts that require correction, e.g. names of local towns and, in our case, Aboriginal nations. Emphasis can kind of be created by using the web-based TTS tool to increase the volume and speed/rate at which particular words are spoken. However, for some reason volume changes can only be applied to entire sentences. Here is a screenshot of how I've used the TTS tool to create emphasis on the words "will not". I have then done some basic audio trimming and stitched the sentence together in Storyline.
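For reference, the same "will not" trick could be written directly as SSML instead of through the web tool. This is a hedged sketch only: the function name `azure_ssml`, the example voice name, and the `+10%` rate are assumptions for illustration. It changes only the rate of the stressed words and leaves volume out, given the sentence-level limitation mentioned above.

```python
from xml.sax.saxutils import escape


def azure_ssml(before, stressed, after,
               voice="en-AU-NatashaNeural", lang="en-AU", rate="+10%"):
    """Build an SSML document that changes the speaking rate of a few words.

    The voice name and rate are illustrative defaults, not recommendations.
    """
    return (
        f'<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f'xml:lang="{lang}">'
        f'<voice name="{voice}">'
        f'{escape(before)} <prosody rate="{rate}">{escape(stressed)}</prosody> '
        f'{escape(after)}'
        '</voice></speak>'
    )


print(azure_ssml("You", "will not", "be able to undo this."))
```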
Our eLearning participants really hate listening to American or British narrators, and they hate the fake, robotic sounding Australian Amazon Polly voice that's built into Storyline. Some have said they would rather mute the entire video and read the captions instead. Us non-American eLearning developers really need an integrated, high quality neural voice within Storyline. But until Articulate make this a priority, the Microsoft Azure voices are a great alternative.
Hi Christopher,
I wanted to share some information about how we manage these feature requests as that may be helpful.
I also would like to see some additional functionality with the text to speech. Especially word emphasis.
I agree as well!
I agree. A way to emphasize a word would really help.
Add me to the list of people interested in word emphasis with TTS. Speed control sounds like a great feature too.
Hear, hear to the above!
@Sheri -
Just curious, what alternative did you find?