Dec 07, 2023

About Text to Speech.

When creating the translated version of a slide, how do I generate TTS audio that can easily be synced by someone who does not speak that language?

This is the method my team came up with:

When generating TTS from the slide notes, we need to split the phrases into separate audio files, and we need to generate separate TTS audio for each slide layer. In both cases, the TTS script must stay in the slide notes.

We need to do both of these things:
1. Keep the TTS script in the Notes (because that is where our Language Editor does her job when exporting the storyfile to MS Word), and
2. Split the TTS output into separate audio events in the timeline, so that each phrase can be synced to specific timeline events, such as the slide text boxes fading in/out at specific moments. This becomes even more important when we translate the TTS script into 16 languages. Each audio phrase is numbered, and our editors use the phrase number to know its meaning in languages they don't speak. If the TTS is a single audio file for the whole slide, our editors can't properly sync the translated TTS to the fade-in times of the onscreen elements.
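
To make the numbering idea concrete, here is a minimal sketch of how a notes script could be split into numbered phrases outside the authoring tool. It assumes one phrase per non-empty line, optionally prefixed with its number. The function names and the file-naming pattern are hypothetical and are not part of any Storyline feature.

```python
import re


def split_numbered_phrases(notes: str) -> list[tuple[int, str]]:
    """Split a notes script into (phrase_number, text) pairs.

    Assumption: one phrase per non-empty line, optionally prefixed
    with its number, e.g. "3. Click Next to continue."
    Lines without a number get the next position in the list.
    """
    phrases = []
    for line in notes.splitlines():
        line = line.strip()
        if not line:
            continue
        match = re.match(r"^(\d+)[.)]\s*(.+)$", line)
        if match:
            number, text = int(match.group(1)), match.group(2)
        else:
            number, text = len(phrases) + 1, line
        phrases.append((number, text))
    return phrases


def audio_file_name(slide: str, language: str, number: int) -> str:
    """Hypothetical naming pattern: the phrase number stays the same in every language."""
    return f"{slide}_{language}_{number:02d}.mp3"
```

Because the phrase number is the same across languages, an editor could match `slide03_de_02.mp3` to phrase 2 of the English script without reading German.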

On top of that, we need to generate different TTS audio for each layer in a slide, and, again, I need to do this from the slide notes. Is there a way to separate the slide notes so that each layer has its own set of notes? This way, I could generate different TTS audio events for each layer of a slide.

In the attached storyfile, you'll see some examples in 4 different languages. The timeline sync is lost when changing from TTS in English to other, lengthier languages, such as German. If we could specify the line(s) of text in the slide notes that apply to the TTS audio within specific slide layer(s), we could separate fade-in events within different layers, which would solve the sync issue between languages.
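
As an illustration of what we are after, here is a rough sketch that assumes a made-up marker line such as `[layer: Details]` in the notes to assign the following phrases to a layer. It also shows how fade-in cue times could be recomputed per language from the phrase durations. The marker syntax and both function names are assumptions for this example only, not something Storyline supports as far as we know.

```python
def group_by_layer(notes: str, default_layer: str = "Base") -> dict[str, list[str]]:
    """Group note lines by layer using a hypothetical "[layer: Name]" marker line."""
    layers: dict[str, list[str]] = {}
    current = default_layer
    for line in notes.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("[layer:") and line.endswith("]"):
            # Switch to the named layer; the marker itself is not spoken.
            current = line[len("[layer:"):-1].strip()
            continue
        layers.setdefault(current, []).append(line)
    return layers


def cue_times(durations: list[float], gap: float = 0.0) -> list[float]:
    """Start time (seconds) of each phrase when the clips play back to back."""
    if not durations:
        return []
    starts = [0.0]
    for duration in durations[:-1]:
        starts.append(starts[-1] + duration + gap)
    return starts
```

With per-phrase durations measured for each language, `cue_times` gives the moment each numbered phrase starts, so the fade-in of the matching text box could be moved to that time even when the German audio runs longer than the English.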

Any suggestions are more than welcome.
Thank you!

