Text-to-Speech is awesome but has one serious flaw.

I am very impressed with the Text-to-Speech engine now included in Storyline 360.  I previously had to use a voice-over artist and this was costly and frustrating when you wanted to make minor changes to the dialogue.  

 

Recording and modifying dialogue is a breeze with Text-to-Speech, except for one major flaw! It is not possible to include any "control" variables in your Text-to-Speech dialogue. By this I mean something like [Pause 5] or [Emphasis]. I have read the discussions around getting the best out of Text-to-Speech and can manipulate emphasis to a certain degree using commas and similar punctuation.

 

The big frustration comes with inserting pauses.  For this, I need to edit the Text-to-Speech sound track and insert the pauses using the Silence function.  This is fine until you need to make any modification to the Text-to-Speech (such as adding a comma to improve emphasis), in which case all the silences I have inserted are deleted and I have to start over again.  

 

As much as Text-to-Speech is saving me time and money, it is also wasting a huge amount of time with the rework of the silences. It would be great if Articulate added support for control variables such as [Pause 5] and [Emphasis] in the Text-to-Speech engine, so that I could include them directly in the Text-to-Speech dialogue.

15 Replies
Gawie Bing

Hi Alyssa. In my quest to find an alternative text-to-speech engine that overcomes my frustrations, I came across Amazon Polly.

How interesting, as Amazon Polly appears to be the exact same text-to-speech engine that Articulate is using! What is even more interesting is that Amazon Polly already supports the in-text tags that I am looking for in their SSML editor.

If Articulate is already hooking into Amazon Polly, why not just enable SSML editing and my problems are solved?
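For illustration, here is a minimal Python sketch of how bracket markers could be turned into SSML. The `[Pause N]` notation and the `to_ssml` helper are hypothetical (they are just the notation suggested in the original post); `<break time="..."/>` is a standard SSML element that Amazon Polly documents. This is not something Storyline does today.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical bracket marker, e.g. "[Pause 5]" for a five-second pause.
PAUSE_MARKER = re.compile(r"\[pause\s+(\d+(?:\.\d+)?)\]", re.IGNORECASE)


def to_ssml(script: str) -> str:
    """Convert a plain script with [Pause N] markers into an SSML string."""
    # Escape &, < and > first so the script text cannot break the XML.
    safe = escape(script)
    # Replace each marker with an SSML break lasting N seconds.
    body = PAUSE_MARKER.sub(lambda m: f'<break time="{m.group(1)}s"/>', safe)
    return f"<speak>{body}</speak>"


print(to_ssml("Welcome to the course. [Pause 2] Let's begin."))
```

Because the pauses live in the script itself, editing a word or adding a comma would not discard them the way editing the generated audio does.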

Robert Cummins

Regardless of SSML editing, at least increase the default pause between sentences. Everything is simply run together no matter what punctuation you add. Maybe Mandarin uses a different symbol. If so, what symbol tells text-to-speech to pause a little before the start of the next sentence?

SSML editing would be nice, but the SSML tags would need to be removed automatically from the generated closed captions (CCs). Otherwise there is just as much work fixing the CCs as adding the SSML.
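As a rough sketch of what that automatic clean-up could look like, the Python below strips SSML tags from a script to produce caption text. The `caption_text` helper is hypothetical and uses a simple regular expression rather than a full XML parser, so it assumes well-formed tags with no `>` inside attribute values.

```python
import re
from xml.sax.saxutils import unescape

BREAK_TAG = re.compile(r"<break\b[^>]*>")  # pauses become a single space
ANY_TAG = re.compile(r"<[^>]+>")           # every other tag is dropped


def caption_text(ssml: str) -> str:
    """Return the readable caption text for an SSML script."""
    # A break separates words, so keep a space where it was.
    text = BREAK_TAG.sub(" ", ssml)
    # Remove the remaining tags but keep the text they wrap.
    text = ANY_TAG.sub("", text)
    # Turn &amp;, &lt; and &gt; back into plain characters.
    text = unescape(text)
    # Collapse runs of whitespace left behind by removed tags.
    return " ".join(text.split())


print(caption_text('<speak>Step one. <break time="2s"/> Step two.</speak>'))
```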

Chris Jamerson

I would like this as well. I'm converting A LOT of PowerPoints with text-to-speech in the notes section. It would save so much time if I could add a pause at the end of each slide without manually adjusting the timeline.  Also, can you add the ability to automatically add text-to-speech when importing slides?  These two items alone would save me many many hours.

emily gill

A further feature request would be the ability to "teach" the text-to-speech how to read certain characters. For example, I have a course in which the phrase "complaint/inquiry" is heavily used, as that is the way the process was written and defined by the client. It would be nice to be able to teach the engine globally that the / does not need to be spoken but can be read as "and or", rather than having to go into each text-to-speech clip to update it, since the find/replace feature does not read captions either.
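One way such a global rule could work, sketched in Python: keep a single table of written forms and spoken forms, and wrap each match in the SSML `<sub alias="...">` element, which Amazon Polly documents. The table contents and the `apply_spoken_forms` helper are hypothetical, and Storyline does not currently expose anything like this.

```python
from xml.sax.saxutils import escape, quoteattr

# Hypothetical course-wide table: written form -> how it should be spoken.
SPOKEN_FORMS = {
    "complaint/inquiry": "complaint and or inquiry",
}


def apply_spoken_forms(script: str, spoken_forms: dict = SPOKEN_FORMS) -> str:
    """Wrap each written form in <sub alias="..."> so it is read as the alias."""
    # Escape the script first so plain text cannot break the XML.
    safe = escape(script)
    for written, spoken in spoken_forms.items():
        # quoteattr adds the surrounding quotes and escapes the alias value.
        tag = f"<sub alias={quoteattr(spoken)}>{escape(written)}</sub>"
        safe = safe.replace(escape(written), tag)
    return f"<speak>{safe}</speak>"


print(apply_spoken_forms("Log each complaint/inquiry within one day."))
```

The written form stays inside the tag, so the on-screen text and captions can keep the client's spelling while the audio uses the alias. With several overlapping entries the simple `replace` loop could match inside an earlier tag, so a real implementation would need to be more careful.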

Lynn Adinolfi

So all of this just happened to me and I am very disappointed to find out that this has been an issue for well over 9 months with no resolution.

I am enjoying this product but this is a real downfall. I just did a 5-minute video with pauses and I had to change one word, then LOST 90 minutes' worth of editing.....

If there is any resolution to this please post....

Lauren Connelly

Hi Lynn!

We have many ideas in this thread that have been shared with our team. When we feel like these changes have reached perfection, we'll update you here!

Secondly, I'm so sorry you've lost 90 minutes worth of editing! I'd like our Support Engineers to take a deeper look into your file! Please use this link to submit a case with them directly.

Scarlett Brooks

I have had a lot of success with inserting audio as small snippets, so I can manipulate them easily. For example, each sentence is usually its own file. If I have something like a list and want not only to manipulate the timing, but also coordinate interactions with the narration, using smaller snippets of sound makes that a breeze.

To maximize the benefit, I copy and paste the narration from the story board...

Hope this helps!

Joe Marino

When I encounter this problem, I type a TTS script with adjusted spellings so the TTS engine records it as it is supposed to sound. What I place in the NOTES field is written as it is intended to be read on screen, but the TTS engine has already recorded the speech as I want it to sound. (Example: "Dr. Smythe" should be pronounced the same as "Dr. Smith", but the TTS engine pronounces it with a "hard Y". So I give the TTS engine the name "Smith" and, after it is recorded, I correct the spelling.)

Thor Melicher

I see several different challenges going on in this post but I might have a solution for you?  It requires going to the source that Storyline uses, Amazon Polly.  To make things a bit simpler, I’ve created an application that addresses many of the things here:

  • Adjust the overall speed of your files with one setting
  • Adjust the overall pause duration for commas
  • Add your own SSML tags to get more finely nuanced, natural-sounding results and, as necessary, correct the pronunciation of words
  • Neural voices
  • Batch process your files

 Here’s what you do:

  1. Get an Amazon Polly account (yes, there is some cost involved but doesn’t seem that prohibitive) (https://aws.amazon.com/polly/)
  2. Save your scripts as separate files (MS-Word or Text)
  3. Download HeroVoice TTS from the Microsoft Windows Store (fully functioning 15-day free trial)
  4. Encode your files with HeroVoice TTS – apply a global setting for speed and even comma duration so your files are consistent.
  5. Select the voice you want – these are the same as you’ll find in Storyline today including Neural voices (which aren’t currently available in Storyline)
  6. Load each audio file into Storyline
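For readers who would rather call Amazon Polly directly, the batch idea can be sketched in Python with the `boto3` SDK. This is not how HeroVoice TTS works internally; it is only a rough illustration. The helper names, folder names, default voice and speed value are assumptions, and Polly limits how much text a single request may contain, so long scripts would need splitting.

```python
from pathlib import Path
from xml.sax.saxutils import escape


def build_ssml(script: str, rate_percent: int = 100) -> str:
    """Wrap a plain script in SSML with one global speaking rate."""
    return (
        f'<speak><prosody rate="{rate_percent}%">'
        f"{escape(script)}</prosody></speak>"
    )


def synthesize_folder(polly, script_dir, out_dir,
                      voice_id="Joanna", engine="neural", rate_percent=100):
    """Turn every .txt script in script_dir into an .mp3 file in out_dir.

    `polly` is expected to be a boto3 Polly client (or anything with a
    compatible synthesize_speech method).
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for script_path in sorted(Path(script_dir).glob("*.txt")):
        response = polly.synthesize_speech(
            Text=build_ssml(script_path.read_text(encoding="utf-8"), rate_percent),
            TextType="ssml",
            OutputFormat="mp3",
            VoiceId=voice_id,
            Engine=engine,
        )
        target = out_dir / (script_path.stem + ".mp3")
        # AudioStream is a file-like object holding the encoded audio.
        target.write_bytes(response["AudioStream"].read())
        written.append(target)
    return written


# Usage (needs `pip install boto3` and AWS credentials; not run here):
#   import boto3
#   synthesize_folder(boto3.client("polly"), "scripts", "audio", rate_percent=95)
```

Passing the client in as a parameter keeps the function easy to try out with a stand-in object before spending anything on real synthesis requests.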