Forum Discussion
New Neural Text to Speech Voices from Amazon Polly
Storyline 360 uses standard voices from Amazon Polly for its text-to-speech functionality. I imagine they use the Polly API for this. This has been great for prototyping slides and getting the timing right. Occasionally it can be good for scenarios if you can't afford to hire other voices. Recently, Amazon has developed a new version of their voices, called Neural Voices, that use better algorithms for synthesizing speech. I would love to see Articulate let us access these new voices directly from Storyline. For now, you can use them by accessing Amazon Polly if you have an Amazon account, downloading the audio, and importing it manually into your timeline. I have created a quick demo to let you hear the difference.
https://360.articulate.com/review/content/a6e4ce5a-f982-42aa-8cd1-76e32b2204cd/review
You can read more about it here: https://aws.amazon.com/blogs/aws/amazon-polly-introduces-neural-text-to-speech-and-newscaster-style/
Hi, everyone!
I have some great news to share. We just released another update for Storyline 360. In Update 83, we’ve included important fixes and new features!
One enhanced feature we’ve included:
Unlock new possibilities for text-to-speech audio. Use speech synthesis markup language (SSML) to adjust the speaking rate, modify pronunciation, emphasize words, add pauses, and more.
To take advantage of this update, launch the Articulate 360 desktop app on your computer, and click the Update button next to Storyline 360. You'll find our step-by-step instructions here!
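For anyone who hasn't written SSML before, here is a rough Python sketch of what markup for the adjustments listed above (speaking rate, emphasis, pauses) might look like. The function name `build_ssml` and its parameters are made up for illustration, and the tags are generic SSML. Which tags a particular voice actually honors varies by engine, so treat this as a starting point rather than what Storyline accepts.

```python
import xml.etree.ElementTree as ET


def build_ssml(before: str, stressed: str, after: str,
               rate: str = "90%", pause_ms: int = 400) -> str:
    """Wrap a sentence in generic SSML (hypothetical helper, not a Storyline API)."""
    speak = ET.Element("speak")
    # <prosody rate="..."> slows down or speeds up everything inside it.
    prosody = ET.SubElement(speak, "prosody", rate=rate)
    prosody.text = before
    # <emphasis> asks the voice to stress these words (support varies by voice).
    emphasis = ET.SubElement(prosody, "emphasis", level="strong")
    emphasis.text = stressed
    # <break> inserts a pause; the text after it is spoken once the pause ends.
    pause = ET.SubElement(prosody, "break", time=f"{pause_ms}ms")
    pause.tail = after
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("You ", "will not", " need a headset for this module."))
```

The printed markup can be pasted wherever your text-to-speech tool accepts SSML. Building it with an XML library rather than by hand keeps the tags balanced and escapes characters such as `&` in the script.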
- RichardMaranta1 (Community Member)
Thor's application is a really useful tool. I've tried it out and if you want to get better quality Neural voices in your project, this is the easiest way to do it!
- DivyanshuPandey (Community Member)
Some startups are also offering good Neural voices, Murf is one of them with a good selection.
I used their service because my clients like good voice-overs, but don't like the extra time it takes to record a real person and don't want to pay for voice talent.
- ThorMelicher-b5 (Community Member)
Bob,
WaveNet for Chrome is the Chrome Plugin that Anthony was most likely referring to - you can get it here:
https://chrome.google.com/webstore/search/wavenet%20for%20chrome?hl=en-US
- ThorMelicher-b5 (Community Member)
@Mitchell,
Looks like Murf is using a mix of providers? At the very least, their 'Toby-UK' voice is Amazon Polly's 'Brian-UK' voice.
- DivyanshuPandey (Community Member)
@Thor, now that you mentioned it, seems like you are right. Though it seems they have many other good voices compared with Amazon Polly. Again might be a mix of providers.
@Anthony, yes, I tried WellSaid Labs as well before Murf, but the pricing is on the higher side for just 4-15 voice options, as against the 30+ English voices offered by Murf.
- ThorMelicher-b5 (Community Member)
And to throw in another group that's similar to Descript (http://www.descript.com): Replica Studios (https://replicastudios.com/).
What's impressive with both companies is how little needs to be recorded vs. what Microsoft offered just a few years ago.
Personally, I haven't tried either one yet, but both look promising.
- BrookeOttley (Community Member)
I've recently secured a subscription to Microsoft Azure's text to speech service. After testing a variety of text-to-speech tools with Australian accents, this is the one we settled on. It has a huge variety of languages and accents, in masculine and feminine voices, and the pronunciation is impeccable. Even better than Google and Amazon's neural voices. I tested its pronunciation of some lengthy medications (currently working on some health-related eLearning videos) and a pharmacist on our team confirmed the TTS got it right, the first time. I've been using it pretty heavily for the last month, and it's only costing us a dollar so far!
You can use IPA and SSML to correct pronunciation, and can even upload a custom lexicon if there are particular words used throughout your transcripts that require correction, e.g. names of local towns and, in our case, Aboriginal nations. Emphasis can kind of be created by using the web-based TTS tool to increase the volume and the speed/rate at which particular words are spoken. However, for some reason volume changes can only be applied to entire sentences. Here is a screenshot of how I've used the TTS tool to create emphasis on the words "will not". I have then done some basic audio trimming and stitched the sentence together in Storyline.
Our eLearning participants really hate listening to American or British narrators, and they hate the fake, robotic-sounding Australian Amazon Polly voice that's built into Storyline. Some have said they would rather mute the entire video and read the captions instead. Us non-American eLearning developers really need an integrated, high-quality neural voice within Storyline. But until Articulate make this a priority, the Microsoft Azure voices are a great alternative.
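To make the IPA correction mentioned above a bit more concrete, here is a minimal Python sketch that wraps one word of a sentence in a standard SSML `<phoneme>` tag. The helper name `with_ipa`, the sample sentence, and the IPA string are illustrative assumptions, not taken from the Azure tool. Azure's service also expects extra attributes on `<speak>` and a `<voice>` element around the text, so check its documentation before sending this as-is.

```python
import xml.etree.ElementTree as ET


def with_ipa(sentence: str, word: str, ipa: str) -> str:
    """Wrap the first occurrence of `word` in an SSML <phoneme> tag (hypothetical helper)."""
    before, found, after = sentence.partition(word)
    if not found:
        raise ValueError(f"{word!r} does not occur in the sentence")
    speak = ET.Element("speak")
    speak.text = before
    # alphabet="ipa" tells the engine that `ph` holds an IPA transcription.
    phoneme = ET.SubElement(speak, "phoneme", alphabet="ipa", ph=ipa)
    phoneme.text = word   # the written form, still used for captions
    phoneme.tail = after  # the rest of the sentence is spoken as usual
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    # Illustrative British-style pronunciation of "tomato".
    print(with_ipa("Add one tomato to the salad.", "tomato", "təˈmɑːtəʊ"))
```

For words that recur across many transcripts, a custom lexicon saves repeating the tag every time; the inline tag is handier for one-off corrections.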
- ShawnAbsolam (Community Member)
It seems that for English (US) only Kevin, Ruth and Stephen are neural voices; the others seem to be using the standard profile.
Hello Shawn,
All the English (US) voices are neural voices in Update 80. By any chance were you updating a narration created in Update 79 or earlier? If so, try this: before you hit Done, make a change to the text (script), like putting an extra space at the end, and then hit the Update button in the Insert Text-To-Speech dialog.
Please let me know how it goes.
Best,
Michael Marcos
Customer Support Product Liaison
Articulate Global LLC
- ShawnAbsolam (Community Member)
Hi, I was just comparing some audio created in an older course, using the same source text in Update 80.