Text to speech in Storyline lessons

Oct 25, 2012

I need to add some accessibility features to lessons built with Storyline, above all the ability to read the on-screen text aloud with a text-to-speech tool. Does anyone have any suggestions? I searched the net but didn't find anything. Thanks...

30 Replies
Anna Vilarnau

Hi Maurizio,

You may be interested in trying iSpeech Text To Speech API: http://www.ispeech.org/api

It's really easy to integrate, so you can add TTS capabilities totally transparent to the end user. There are over 40 voices available in 26 different languages, and the quality of the voices is top notch. You can test all the voices here: http://www.ispeech.org/text.to.speech

If you are interested let me know, I can help you. My email address is anna [at] ispeech [dot] org

Best regards,


Roger Mepham

I too have looked at speech synthesis for e-learning. The Captivate NEO voices in Captivate 5 and 6 were usable; the female voice in particular was OK-ish. But many clients throw their hands up in horror when you mention artificial speech. I just did some googling and came across http://www.naturalreaders.com, which seems to have a solution with a choice of Acapela voices (the UK Graham one sounds reasonable) for around £100 or so.

I see that Storyline outputs all its text as a Word doc file, so it should be possible to do a bulk conversion.

But in truth none of the voices sound convincing to my ears, and you lose all the nuances and word stresses that adding voice to e-learning should provide anyway.

If anyone has any suggestions I'd love to hear them, meanwhile Maplin do a great USB Podcast mike for £40 - sigh.

Geert De Rycke

Ciao Maurizio,

Neither application is free, but they are not very expensive: $20-$30 (http://www.nextup.com)

The voices we use are about the same price.

The procedure is very simple.

You enter your text into their editor

Listen to how it reads out the text

If you're happy, you can save it as a .WAV or an MP3 file.

Storyline & Articulate can import these files (slide by slide)


Mark Mulkerin

I can't recommend a text to speech option, but the question got me thinking about how to post-process something that was generated by TTS. After a few minutes of playing with MorphVOX and Adobe Audition, it seems like you can apply some vocal effects that take the computer-generated edge off. Has anyone else given this a try and if so, any thoughts? Sadly, I don't know if this would help Maurizio out if he needs to do text to speech on the fly in Storyline rather than embedding .wav or mp3 files.



Maurizio Mattioli

Thank you all for the very interesting answers. I realised that I didn't explain very clearly what I need. Let me try again: I would like to put a button on each page of the Storyline lesson that performs text to speech on the fly when the user clicks it. Only if that is impossible would I consider the alternative of exporting the texts, using a TTS tool to produce an mp3 file of each text, and importing the audio files into Storyline.

That said, I know that it's almost impossible to produce TTS audio files that sound like real human voices (I have already worked with Loquendo, an Italian company acquired last year by Nuance, and with Acapela: with a lot of hard work you can get very close, but it's not worth the hassle, because recording real voices is less expensive). Nevertheless I think that it's better than nothing and it could be a useful service for users with problems like dyslexia...

Two other points:

- I'm (unfortunately) Italian, so I need TTS with Italian voices

- the learning lessons are for commercial use

Thank you all.



joann lynch

I'm looking for a bit of research on the effectiveness of text to speech vs. voice narration. There are many reasons to use one vs. the other (inflection, reduced effort on edits, etc.) and I know we all have our own opinions, but I'm looking for some scientific backing. Are you aware of any studies that have been done?

Wim van den Bosch

My experience with TTS (not scientific backing):

I have tried various tools, including the aforementioned TextAloud. My first conclusion would be that the best software (writer) to use with voices is "Loquendo TTS Director"; this software gives you complete control over the voices. Via speech tags, you can manipulate speed, pitch and duration, insert various pauses, etc. You can make your own dictionary for words you would like to have pronounced in a customized way, and it's also possible to have words spelled out, or to have numbers read in time/date formats.
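The kind of tagging described here can be sketched with W3C SSML, a standard markup that many TTS engines accept. This is not Loquendo's own tag syntax, which this thread does not show; the helper names (escapeXml, applyLexicon, toSsml) and the sample values are assumptions made purely for illustration.

```typescript
// Hypothetical options type; not part of any TTS product's API.
interface ProsodyOptions {
  rate?: string;    // SSML prosody rate, e.g. "slow", "medium", "fast"
  pitch?: string;   // SSML prosody pitch, e.g. "low", "medium", "high"
  pauseMs?: number; // pause inserted between sentences, in milliseconds
}

// Escape characters that are special in XML.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Escape characters that are special in a regular expression.
function escapeRegex(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Custom dictionary: wrap each listed word in an SSML <sub> element so the
// engine speaks the alias instead. Assumes lexicon keys are plain words.
function applyLexicon(escaped: string, lexicon: Record<string, string>): string {
  const words = Object.keys(lexicon);
  if (words.length === 0) {
    return escaped;
  }
  // One pass over the text, so inserted aliases are never re-matched.
  const pattern = new RegExp("\\b(" + words.map(escapeRegex).join("|") + ")\\b", "g");
  return escaped.replace(
    pattern,
    (word) => `<sub alias="${escapeXml(lexicon[word])}">${word}</sub>`
  );
}

// Build one SSML document: speed and pitch via <prosody>, pauses via <break>.
function toSsml(
  sentences: string[],
  lexicon: Record<string, string>,
  opts: ProsodyOptions = {}
): string {
  const rate = opts.rate === undefined ? "medium" : opts.rate;
  const pitch = opts.pitch === undefined ? "medium" : opts.pitch;
  const pauseMs = opts.pauseMs === undefined ? 300 : opts.pauseMs;
  const pause = `<break time="${pauseMs}ms"/>`;
  const body = sentences
    .map((s) => applyLexicon(escapeXml(s), lexicon))
    .join(pause);
  return (
    '<speak version="1.1" xmlns="http://www.w3.org/2001/10/synthesis">' +
    `<prosody rate="${rate}" pitch="${pitch}">${body}</prosody>` +
    "</speak>"
  );
}
```

For example, toSsml(["Open the SDK.", "Click Next."], { SDK: "S D K" }, { rate: "slow", pauseMs: 500 }) produces markup that asks the engine to read slowly, spell out "SDK", and pause half a second between the two sentences. Whether a given engine honours every element depends on the engine.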

The voices available from Loquendo (version 7+) are the best I have seen/used. AT&T Natural Voices would be #2 and Acapela Telecom voices #3.

My 2 cents:

The pros of using TTS vs Voice Narration:

  • Though time consuming, through the use of tags inside Loquendo TTS Director it's possible to have any text read as naturally as by a human...and "accent less" (Received Pronunciation)
  • Some texts are read very naturally straight away, without the time-consuming tags. This often depends on the voice used, e.g. some texts are read better by Simon than by Dave (often because of the intonation and/or stress given to a specific word....but like I said in the point above...if you have time you can change all that by inserting tags)
  • If you simply want to use TTS to include accessibility options...then Director or TextAloud will both suffice, and this would save you an enormous amount of time compared to voice narration.
  • Loquendo has an SDK and API for use with various devices...and this is what Maurizio should look into to meet his needs.

The cons:

  • You can guess it already: to make sure that any text sounds 100% natural, you will need to use tags....and this is 1. time consuming and 2. a steep learning curve. A macro program (to remember multiple ctrl+c copies, or to assign hotkeys for tags) could help you speed up the process. However, since you don't want words to be pronounced the same way in every text....you will have to constantly adjust tags....or use the Lexicon Manager inside Loquendo to create different dictionaries for different texts :( + :)
  • Actually the above is the only problem.....but a huge one! One other issue that I have encountered is that Nuance has been very busy buying up companies that offer TTS....although Nuance specializes in STT (speech to text)...I believe that they are dominating, or will dominate, the market when it comes to STT & TTS.....so, before buying anything, check on the main site whether Nuance has recently acquired the company.

I don't mind that a professional company like Nuance has been buying TTS scripts/programs/companies.....one problem though....They do not do anything with it as of yet. I looked through their offered software solutions, but no sign of TTS:(

Maurizio already mentioned Loquendo and Nuance...I hope my contribution can help a bit more.



Wim van den Bosch


Most programs (except Loquendo TTS, which will only see Loquendo voices) can use voices from other companies. E.g. TextAloud will see voices from Acapela Telecom, AT&T, Microsoft and Loquendo.

So, why does it matter that Loquendo is not in business anymore? You can still use their voices and the program.

I have not seen a more complete TTS application than Loquendo Director + SDK + API....and I would say that if you can't do what you are trying to do with Loquendo...then I think it will be very hard to accomplish with another company's product.


2 things to add to my previous post:

1. http://www.loquendo.com/en/demo-center/interactive-tts-demo/  (simple demo of Loquendo)

2. Looking at the embedded expressions in Loquendo, which sound absolutely natural.....I would suspect that in a few years....we will just have 1.000.000 common phrases, expressions and frequently used word combinations...and with these we could very easily and quickly make natural-sounding interactions for use in quizzes.

I have software called iClone...and CrazyTalk..which both support TTS, and the voices of AT&T, Loquendo etc. are automatically detected...it would be nice if Articulate Storyline had this capability (of directly incorporating TTS) as well.

I have copied the text above into Loquendo TTS Director and converted it into an MP3 file. No tags used and no alteration...so the quality isn't always great....but with some (non-time-consuming) clean-up and minimal tagging....it could sound a lot better

Monica Moritz

Hello everyone! I see the discussion above, and that it is two years since the contributions were made.

I would like to do exactly the same thing that Maurizio Mattioli talked about. So I wonder if you know of any newer suggestions on how to get the text that you type in Articulate 2 turned into a voice (TTS).

Ron Starc

The current best text to speech software is Text Speaker. It has customizable pronunciation, reads anything on your screen, and it even has talking reminders. It is great for learning languages as it highlights the words as they are being read. The bundled voices are well priced and sound very human. Voices are available in English, French, Italian, Spanish, German, and more. Easily converts blogs, email, e-books, and more to MP3 or for listening instantly.

Danny Stefanic

Hi Monica,

You may also want to look at a solution that requires no audio files or streaming bandwidth and uses the highest quality text to speech voices available in browsers today, there is a thread about it here

Let me know if you have any questions; there is an add-on specifically for Storyline 2.
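For anyone who wants to experiment with browser voices directly, here is a minimal sketch using the browser's Web Speech API (speechSynthesis). It is not the add-on mentioned above, just one way on-the-fly speech from a button could work. It assumes the lesson runs in a browser that supports the API and that the code is called from a Storyline "Execute JavaScript" trigger; pickVoice, speakText and the Storyline variable name SlideText are hypothetical names chosen for the example.

```typescript
// Minimal shape of a browser voice; real objects come from speechSynthesis.getVoices().
interface VoiceLike {
  name: string;
  lang: string; // language tag such as "it-IT"
}

// Pick the first voice whose language tag starts with the requested prefix (e.g. "it").
function pickVoice<T extends VoiceLike>(voices: T[], langPrefix: string): T | undefined {
  const wanted = langPrefix.toLowerCase();
  const matches = voices.filter(
    // Some platforms report tags like "it_IT", so normalise the separator first.
    (v) => v.lang.toLowerCase().replace("_", "-").indexOf(wanted) === 0
  );
  return matches[0];
}

// Speak the text aloud. Returns false when speech synthesis is not available.
function speakText(text: string, lang: string): boolean {
  const g = globalThis as any; // avoids depending on DOM typings
  if (!g.speechSynthesis || !g.SpeechSynthesisUtterance) {
    return false;
  }
  const utterance = new g.SpeechSynthesisUtterance(text);
  utterance.lang = lang; // lets the browser choose a default voice for the language
  // getVoices() may still be empty right after page load; in that case only lang is used.
  const voices = g.speechSynthesis.getVoices() as VoiceLike[];
  const voice = pickVoice(voices, lang.split("-")[0]);
  if (voice) {
    utterance.voice = voice;
  }
  g.speechSynthesis.cancel(); // stop anything that is still being read
  g.speechSynthesis.speak(utterance);
  return true;
}

// Possible use inside a Storyline "Execute JavaScript" trigger, once the two
// functions above are available on the page as plain JavaScript:
//   var player = GetPlayer();
//   speakText(player.GetVar("SlideText"), "it-IT"); // "SlideText" is a made-up variable
```

The text to read has to be stored in a Storyline variable (or otherwise passed in) for this to work, and which voices are available, including Italian ones, depends on the user's browser and operating system rather than on the lesson.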

Henrik Clausen

Currently, we're using Adobe Captivate, which has 5 pretty decent voices and lets us manage all TTS as a single project file, from which you export MP3 or WAV files. The main drawback of TTS is that it's monotone: the software obviously has no clue about giving emphasis to specific parts of a sentence. For projects on a budget, this is viable for production use. Other projects that have a lot of important information in the spoken text need professional narration.

My wishlist for Storyline 3 includes integrated TTS. That would make my life a lot easier!