14 Replies
Leslie McKerchie

Hi Christopher,

Thanks for reaching out and letting us know what you would like to use in the text-to-speech tool. This is not available at this time, but we are tracking a feature request for some additional functionality, such as speed control and emphasis.

I've added this conversation to the report as we track user impact and so that we can share any updates with you here.

I wanted to share some information about how we manage these feature requests as that may be helpful.

Thor Melicher

I recently created an application that might address the need to adjust the speed of Text to Speech in Storyline and to add emphasis to words by using SSML tags. It’s a bit of a workaround, though, as you’ll have to go to the source that Storyline uses: Amazon Polly voices.

Here’s what you do:

  1. Get an Amazon Polly account (yes, there is some cost involved, but it doesn’t seem prohibitive) (https://aws.amazon.com/polly/)
  2. Save your scripts as separate files (MS-Word or Text)
  3. Download HeroVoice TTS from the Microsoft Windows Store (fully functioning 15-day free trial)
  4. Encode your files with HeroVoice TTS – apply a global setting for speed and even comma duration so your files are consistent.
  5. Select the voice you want – these are the same as you’ll find in Storyline today including Neural voices (which aren’t currently available in Storyline)
  6. Load each audio file into Storyline 

The Amazon Polly documentation page on supported SSML tags is a good resource, as it lists which tags are supported for each voice type (Standard or Neural). The Neural voices are higher quality, but they *don't* support the SSML emphasis tag.
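For anyone who would rather script this against Polly directly, here is a minimal sketch of the same idea: one global speed setting plus a fixed pause after every comma, expressed with the SSML `<prosody>` and `<break>` tags. It is not how HeroVoice TTS works internally; the function names, the default rate, the pause length, and the `Joanna` voice are just assumptions for illustration. The second function needs `boto3` and AWS credentials, and you should check the supported-tags page for your chosen voice before relying on it.

```python
from xml.sax.saxutils import escape


def build_polly_ssml(text: str, rate: str = "90%", comma_pause_ms: int = 300) -> str:
    """Wrap plain text in SSML with one global speaking rate and a fixed pause after commas."""
    escaped = escape(text)  # protect &, <, > so the SSML stays well-formed
    pause = f',<break time="{comma_pause_ms}ms"/>'
    with_pauses = escaped.replace(",", pause)
    return f'<speak><prosody rate="{rate}">{with_pauses}</prosody></speak>'


def synthesize_to_mp3(ssml: str, out_path: str,
                      voice_id: str = "Joanna", engine: str = "neural") -> None:
    """Send SSML to Amazon Polly and save the MP3 (assumes boto3 and AWS credentials)."""
    import boto3  # imported lazily so build_polly_ssml works without AWS set up

    polly = boto3.client("polly")
    response = polly.synthesize_speech(
        Text=ssml,
        TextType="ssml",
        OutputFormat="mp3",
        VoiceId=voice_id,
        Engine=engine,
    )
    with open(out_path, "wb") as f:
        f.write(response["AudioStream"].read())


if __name__ == "__main__":
    # Prints the SSML only; call synthesize_to_mp3(...) once your AWS account is configured.
    print(build_polly_ssml("Welcome back, everyone. Today, we cover safety & compliance."))
```

Running the same `build_polly_ssml` settings over every script file keeps the speed and comma pauses consistent across a course, and the resulting MP3 files can then be imported into Storyline as in step 6.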

Mike Adamski

Co-signing this request, as it would be very nice to add emphasis by placing * * around a word or something like that. I imagine emphasis could just be a slight tweak to the pitch with a short delay.

I've used Adobe Premiere Pro for most of my videos, but create the audio from text within Storyline and then export it. It'd be nice to adjust speed and pitch and also do both independently from one another.

Brooke Ottley

+1 for this feature request. The ability to change the emphasis would be especially useful for words where the syllable stress/emphasis can actually alter the meaning of the word. E.g. "record" (noun) vs "record" (verb).

Also, as an Australian user, the American voices are unfortunately much more natural sounding than the Australian voices currently available in Storyline via Amazon Polly. Please incorporate the latest Neural Polly voices into Storyline for us non-American English-speaking users - and for our clientele, who find the American and UK voices annoying and sometimes difficult to understand.

Brooke Ottley

I've recently secured a subscription to Microsoft Azure's text to speech tool. After testing a variety of text-to-speech tools with Australian accents, this is the one we settled on. It has a huge variety of languages and accents, in masculine and feminine voices, and the pronunciation is impeccable. Even better than Google and Amazon's neural voices. I tested its pronunciation of some lengthy medications (currently working on some health-related eLearning videos) and a pharmacist on our team confirmed the TTS got it right, the first time. And the pricing... well let's just say I've been using it pretty heavily for the last month, and it's only costing us $1 so far!

You can use IPA and SSML to correct pronunciation, and can even upload a custom lexicon if there are particular words used throughout your transcripts that require correction, e.g. names of local towns and, in our case, Aboriginal nations. Emphasis can kind of be created by using the web-based TTS tool to increase the volume and the speed/rate at which particular words are spoken. However, for some reason volume changes can only be applied to entire sentences. Here is a screenshot of how I've used the TTS tool to create emphasis on the words "will not". I have then done some basic audio trimming and stitched the sentence together in Storyline.
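If you prefer to write the SSML by hand rather than use the web tool, the markup for this looks roughly like the sketch below. It only builds the SSML string; it does not call Azure. The helper names, the rate and volume values, the `en-AU-NatashaNeural` voice, and the example IPA string are all assumptions for illustration, and I haven't verified whether a word-level volume change behaves any differently in raw SSML than it does in the web tool.

```python
from xml.sax.saxutils import escape, quoteattr

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def emphasize(phrase: str, rate: str = "+10%", volume: str = "+20%") -> str:
    """Approximate emphasis by speaking a phrase a little faster and louder."""
    return (f"<prosody rate={quoteattr(rate)} volume={quoteattr(volume)}>"
            f"{escape(phrase)}</prosody>")


def pronounce(word: str, ipa: str) -> str:
    """Override the pronunciation of one word with an IPA transcription."""
    return f'<phoneme alphabet="ipa" ph={quoteattr(ipa)}>{escape(word)}</phoneme>'


def build_azure_ssml(body: str, voice: str = "en-AU-NatashaNeural",
                     lang: str = "en-AU") -> str:
    """Wrap an SSML body in the <speak> and <voice> elements the service expects."""
    return (f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang={quoteattr(lang)}>'
            f"<voice name={quoteattr(voice)}>{body}</voice></speak>")


if __name__ == "__main__":
    # The IPA below is an illustrative transcription of the verb "record".
    body = ("You " + emphasize("will not") + " "
            + pronounce("record", "rɪˈkɔːd") + " this session.")
    print(build_azure_ssml(body))
```

The printed SSML can be pasted into the tool's SSML view or sent through the Speech SDK, and the exported audio imported into Storyline as usual.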

Our eLearning participants really hate listening to American or British narrators, and they hate the fake, robotic-sounding Australian Amazon Polly voice that's built into Storyline. Some have said they would rather mute the entire video and read the captions instead. We non-American eLearning developers really need an integrated, high-quality neural voice within Storyline. But until Articulate make this a priority, the Microsoft Azure voices are a great alternative.