Export closed captions in bulk?

Oct 09, 2017

Is it possible to export closed captions in bulk? I have a course with 9 scenes, each scene at least 10-20 slides and/or layers. As you can imagine, exporting the captions slide by slide, layer by layer, is brutal.

73 Replies
James Bertelsen

I considered writing a script that would take the XML as input, extract the text from each vtt file, and associate the extracted text with the name of the media file as an output report. But it seems to me, the result would just be a listing of original media file names and their captions, in no particular order.

How would that be helpful, as opposed to just having a transcript of all captions, organized by slide, in the order in which they actually appear on the slide and in the course?

Peter Fitzgerald

That's a good question and it looks like my question about adapting the script could have been better worded. The ultimate goal here is to extract all of the timestamped VTT files named as they appear in the Media Library without the need to export each one individually.

Renaming the project from .story to .zip allows you to retrieve all the audio and VTT files at once (as though you went through and clicked "Export" on each Media Library item), but each file has a new generated file name.
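
For anyone who wants to script that step, here is a minimal Python sketch. It assumes only what is described above, namely that a .story file can be opened as a zip archive. The extension list and the `extract_media` name are my own assumptions, not anything Storyline documents, and the files still come out under their generated names.

```python
import zipfile
from pathlib import Path

# Assumption: the media and caption files of interest use these extensions.
MEDIA_EXTENSIONS = {".vtt", ".mp3", ".mp4", ".wav", ".m4a"}


def extract_media(story_path, out_dir, extensions=MEDIA_EXTENSIONS):
    """Copy every archive entry with a matching extension into out_dir.

    zipfile opens the archive by content, so the .story file does not
    need to be renamed to .zip first. Entries are flattened into one
    folder, so two entries with the same base name would overwrite
    each other.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(story_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            name = Path(info.filename).name
            if Path(name).suffix.lower() in extensions:
                (out / name).write_bytes(archive.read(info))
                extracted.append(name)
    return sorted(extracted)
```

Calling `extract_media("course.story", "exported_media")` (both names hypothetical) would return the sorted list of file names it wrote.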

The theme.xml.rels file will allow you to match up the generated names of the audio file with the generated names of the VTT files.

But the final step here would be to match up the generated file names with the original file names as they appear in the Media Library.
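
As a starting point for inspecting theme.xml.rels, here is a rough Python sketch that dumps its relationship entries. It assumes the file follows the usual Open Packaging Conventions layout (`Relationship` elements with `Id`, `Type` and `Target` attributes). I have not confirmed that against Storyline's format, so treat the element and attribute names as assumptions. It only lists the generated names; it does not solve the original-name problem.

```python
import xml.etree.ElementTree as ET
import zipfile


def list_relationships(story_path, rels_suffix="theme.xml.rels"):
    """Return (Id, Type, Target) tuples from matching .rels entries.

    Assumption: the .rels file is XML made of <Relationship> elements,
    as in other Open Packaging Conventions packages.
    """
    rows = []
    with zipfile.ZipFile(story_path) as archive:
        for name in archive.namelist():
            if not name.endswith(rels_suffix):
                continue
            root = ET.fromstring(archive.read(name))
            for element in root.iter():
                # Compare local names so the XML namespace does not matter.
                if element.tag.rsplit("}", 1)[-1] == "Relationship":
                    rows.append(
                        (element.get("Id"), element.get("Type"), element.get("Target"))
                    )
    return rows
```

Sorting or grouping the returned rows by `Target` is one way to eyeball which generated audio name sits next to which generated VTT name.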

Tiffany Dorris
Thor Melicher

Hi, Steve - Not sure what you mean - would you mind enlightening us? 

One other thing I found since my last post: the Media Library within Storyline makes it a tad faster to export your captions. It's still one file at a time, but you no longer have to go slide by slide:

From the menu bar,

  1. Click View | Media Library
  2. Select Video
  3. *Next to Captions, click Edit Captions
  4. Click Export...

*When clicking Edit Captions you would think it would start editing captions but instead you get the following choices: Edit, Replace, Export, Delete, and Apply to All.  Not sure what Apply to All means as it is also shown for Alt Text, too.

 

@Storyline - can you not add a select box to the Media Library so we can select the audio/video we want and then export the captions for all of them? It seems like it would be a simple add-on, since the Media Library already groups all the video and audio together and even identifies which ones have captions. They could export as a CSV or even plain text. Video would have to be done separately from audio, but that would still be a lot better than the current option.

Thor Melicher
Hey Peter!

That's the problem I ran into a year ago.  Storyline obfuscates the connection so, at least for me anyways, I could never make a direct correlation from the original file to the updated filename.  It's tantalizingly close when you see it in the theme.xml.rels file but that's where it seems to break down.  My hunch is that there's a lookup table created so Storyline can make the connection that we can't see - the table is probably being used *when* the course is published which is why we can't see the gyrations they're doing and hence can't make the correlation between what's published and the actual filename within the course.

If one could do what you're suggesting then the solution you're proposing wouldn't be difficult to create and there would be a lot of happy users. :)

Thor

Peter Fitzgerald

I could never make a direct correlation from the original file to the updated filename.

With the help of a couple utilities, I was able to match up the file names by comparing the duration of the original MP3s against those of the renamed files. Certainly not an elegant or foolproof solution, but it could be a timesaver in some cases. Either way, a couple of really useful tools.

NirSoft SysExporter: Copy details from Windows Explorer columns
https://www.nirsoft.net/utils/sysexp.html

Den4B ReNamer: Batch rename files and folders using rules and regular expressions.
https://www.den4b.com/products/renamer 
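
The duration comparison itself could be sketched in Python as below. This assumes you have already collected the durations in seconds (for example from the Explorer columns copied with the utility above). The function name, the tolerance and the file names in the example are all hypothetical. Files with equal or near-equal durations are reported as unresolved rather than guessed, which is why this is not foolproof.

```python
from collections import Counter


def match_by_duration(originals, generated, tolerance=0.05):
    """Pair original and generated file names whose durations agree.

    originals, generated: dicts mapping file name -> duration in seconds.
    Returns (matches, unresolved): matches maps original -> generated
    name; unresolved lists originals with zero or several candidates.
    """
    candidates = {
        name: [g for g, d in generated.items() if abs(d - duration) <= tolerance]
        for name, duration in originals.items()
    }
    # A generated file claimed by more than one original is ambiguous too.
    claims = Counter(found[0] for found in candidates.values() if len(found) == 1)
    matches, unresolved = {}, []
    for name, found in candidates.items():
        if len(found) == 1 and claims[found[0]] == 1:
            matches[name] = found[0]
        else:
            unresolved.append(name)
    return matches, unresolved


# Hypothetical durations, for illustration only.
originals = {"intro.mp3": 12.40, "outro.mp3": 30.10}
generated = {"x1.mp3": 12.41, "x2.mp3": 30.10}
matches, unresolved = match_by_duration(originals, generated)
```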

Matthew Uhrich

This is one of the few things I miss about Captivate: the Word publish output included all the captions automatically, which made it easy to search my hundreds of output projects for mentions of a particular word or product. Right now the Word output from SL is pretty useless. Could we add captions to that output? How is that document generated?

Lauren Connelly

Hello Matthew!

I appreciate you taking the time to share this feedback with us. I've added your comments to our feature request to reflect the impact of this addition. I'll be sure to share any updates with you in this discussion. I hope community members will share if they find a workaround to help you in the meantime!

James Bertelsen

I have built a tool that generates a report from a Storyline 360 file.

Each row of the report provides:
1. The media filename  (generated by Storyline, and the original filename).
2. The captions filename (generated by Storyline, and the original filename).
3. The captions text (with timecodes and headers) and the plain text (without timecodes and headers).
4. The number of words in each captions file, and in total across the entire project.
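
To illustrate items 3 and 4, here is a minimal Python sketch of turning caption text with timecodes and headers into plain text and counting its words. It is not the tool's actual code. It assumes standard WebVTT layout (blocks separated by blank lines, cue text following a line containing `-->`), and the function names are hypothetical.

```python
import re

TAG = re.compile(r"<[^>]+>")  # inline cue tags such as <b> or <v Speaker>


def vtt_to_text(vtt):
    """Return the cue text of a WebVTT string without headers or timecodes."""
    pieces = []
    for block in re.split(r"\n\s*\n", vtt.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        # Blocks without a timing line (WEBVTT header, NOTE, STYLE) are skipped.
        for index, line in enumerate(lines):
            if "-->" in line:
                # Everything after the timing line is caption text.
                text_lines = lines[index + 1:]
                pieces.extend(TAG.sub("", t).strip() for t in text_lines if t.strip())
                break
    return " ".join(pieces)


def word_count(vtt):
    """Count whitespace-separated words in the plain caption text."""
    return len(vtt_to_text(vtt).split())


sample = (
    "WEBVTT\n\n"
    "1\n00:00:00.000 --> 00:00:02.000\nHello <b>world</b>\n\n"
    "NOTE a comment\n\n"
    "00:00:02.000 --> 00:00:04.000\nsecond line\nwraps here\n"
)
print(vtt_to_text(sample))  # Hello world second line wraps here
print(word_count(sample))   # 6
```

Summing `word_count` over every captions file would give the project-wide total.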

I am currently testing the tool, and have successfully generated reports for three .story files.

Can anyone share with me a Storyline 360 file with captions that I can test? I need more examples to ensure that it works consistently.

Thanks!

James Bertelsen

Yes. I'm working now on...

1. Upload your .story file.
2. Get the report and media files returned to you in a zip file.
3. Upload your translated media files and vtt files and get back the story file with all your media and captions in place.

When finished, this should provide a bulk caption export/import solution that makes translating media and captions in story files much easier.

People are welcome to email a link to a .story file while I'm still in the testing phase, if you'd like a report for your story file. 

James Bertelsen

I have created an app that does the following:

  1. Allows me (only, at this stage of development) to upload a Storyline 360 .story file, or select a previously uploaded .story file.
  2. Generates a report table of all audio & video files in the .story file, including:
    1. Names of original and converted media and caption files.
    2. If the caption file already has text, display the text, provide a word count for that file, and the total number of words in all captions in the .story file.
    3. Automatically generate speech-to-text captions files for each media file, individually or as an automated batch process for all media files.
  3. Export:
    1. The report as an Excel .xlsx file
    2. The Storyline 360 .story file, with updated captions
    3. The media and vtt files

I am considering adding:

  1. Search and replace text across all captions in one or multiple .story files.
  2. Machine translation of captions from one language to another for multiple languages
  3. Text-to-speech for multiple languages

I am continuing to develop and test the app. If you have a .story file that you could share for testing, I may be willing to provide you with a report and, possibly, the updated Storyline 360 with auto-generated captions.

Any feedback or suggestions would also be appreciated.

(Video attached.)

Thanks,

Jim Bertelsen

jimbertelsen@gmail.com

Troy May

Create Text Transcript from Storyline Project

Alternative A

  1. Export a video of the entire project. If you do not wish to wait a LONG TIME, you can export as a CD instead, then watch the output directory to see when SL3 has exported each audio track. Then join the audio together using a free app, change the MP3 to MP4, and...
  2. Import MP4 video into Microsoft Stream. Stream is part of every 365 subscription. 
  3. Wait for the video to process (2 hours for every hour of audio). Ensure it does so with its closed captions feature ON.
  4. Export the .vtt it auto-generates from the audio on your video.
  5. Use the Microsoft Stream transcript VTT file cleaner to get a big fat text file of your entire project. 
  6. Use your story project file to "clean" the text, separating into "slide" parts. 

 

Alternative B

  1. Wait for Articulate to add the following functionality to SL4.
    • Add a .vtt text assembler with a new EXPORT ALL CAPTIONS button command on the Audio OPTIONS tab.
    • The Vtt Assembler first considers all audio tracks which have .vtts attached.
    • The Vtt Assembler joins the textual contents of each .vtt attachment.
    • The Vtt Assembler exports the joined data as a .txt file.

For all of us computer science wizards, it's that simple, guys. :)   
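
In the meantime, the assembler from Alternative B can be approximated outside Storyline. This is a rough Python sketch that assumes the .vtt files have already been exported or extracted into one folder. The function names and the `[file name]` section headings are my own choices. Files are joined in file-name order, which is not necessarily slide order.

```python
from pathlib import Path


def cue_text(vtt):
    """Keep caption lines only, using a simple line-by-line heuristic."""
    kept = []
    for line in vtt.splitlines():
        line = line.strip()
        # Skip blanks, the WEBVTT header, timing lines and numeric cue ids.
        # Caveat: a caption line made only of digits is dropped as well.
        if not line or line.startswith("WEBVTT") or "-->" in line or line.isdigit():
            continue
        kept.append(line)
    return " ".join(kept)


def assemble_captions(vtt_dir, out_file):
    """Join the text of every .vtt file in vtt_dir into one .txt file."""
    files = sorted(Path(vtt_dir).glob("*.vtt"))
    sections = [
        # utf-8-sig strips a leading byte order mark if the file has one.
        f"[{f.name}]\n{cue_text(f.read_text(encoding='utf-8-sig'))}"
        for f in files
    ]
    Path(out_file).write_text("\n\n".join(sections) + "\n", encoding="utf-8")
    return len(files)
```

Calling `assemble_captions("exported_vtt", "all_captions.txt")` (hypothetical paths) returns the number of files joined.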

Martin Abildgaard

Seriously, I can't believe such a simple bulk export/import feature does not exist!?!

We currently translate to 18 languages, and we have to manually export each cc-file for every piece of audio, have it translated, and manually upload each file for each language - it's excruciatingly time-consuming and should be an automated process!

Come on, Articulate, you can do better!