Export closed captions in bulk?

09/24/21 at 2:00 pm (UTC)

I considered writing a script that would take the XML as input, extract the text from each vtt file, and associate the extracted text with the name of the media file as an output report. But it seems to me, the result would just be a listing of original media file names and their captions, in no particular order.

How would that be helpful, as opposed to just having a transcript of all captions, organized by slide, in the order in which they actually appear on the slide and in the course?

Peter Fitzgerald

09/24/21 at 2:23 pm (UTC)

That's a good question and it looks like my question about adapting the script could have been better worded. The ultimate goal here is to extract all of the timestamped VTT files named as they appear in the Media Library without the need to export each one individually.

Renaming the project from .story to .zip allows you to retrieve all the audio and VTT files at once (as though you went through and clicked "Export" on each Media Library item), but each file has a new generated file name.

The theme.xml.rels file will allow you to match up the generated names of the audio file with the generated names of the VTT files.

But the final step here would be to match up the generated file names with the original file names as they appear in the Media Library.
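
A minimal Python sketch of the inspection steps described above. It assumes the .story file opens as an ordinary ZIP archive (as the rename trick suggests) and that the .rels entries use `Relationship` elements with `Id`, `Type`, and `Target` attributes; the thread confirms neither layout detail, and the function name is made up for illustration. It only lists what is in the package and does not solve the final step of recovering the original Media Library names.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def list_captions_and_relationships(story_bytes):
    """Return (vtt_names, relationships) found inside a .story package.

    Assumptions (not confirmed in this thread): the package is a ZIP
    archive, and each .rels entry holds <Relationship Id Type Target>
    elements.
    """
    vtt_names = []
    relationships = []
    with zipfile.ZipFile(io.BytesIO(story_bytes)) as package:
        for name in package.namelist():
            lower = name.lower()
            if lower.endswith(".vtt"):
                vtt_names.append(name)
            elif lower.endswith(".rels"):
                root = ET.fromstring(package.read(name))
                for element in root.iter():
                    # Compare the local tag name so any XML namespace is accepted.
                    if element.tag.rsplit("}", 1)[-1] == "Relationship":
                        relationships.append(
                            (
                                name,
                                element.get("Id"),
                                element.get("Type"),
                                element.get("Target"),
                            )
                        )
    return vtt_names, relationships
```

Because `zipfile` ignores the file extension, this can be pointed at a copy of the project without renaming it, for example `list_captions_and_relationships(open("course.story", "rb").read())`, where `course.story` is a placeholder name.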

09/24/21 at 3:52 pm (UTC)

Thor Melicher

Hi, Steve - Not sure what you mean - would you mind enlightening us?

One other thing that I found since my last post is the Media Library within Storyline does make it a tad faster to export your captions, still one at a time but it's not necessary to go one slide at a time, either:

From the menu bar,

Click View | Media Library

Select Video

*Next to Captions, click Edit Captions

Click Export...

*When clicking Edit Captions you would think it would start editing captions but instead you get the following choices: Edit, Replace, Export, Delete, and Apply to All. Not sure what Apply to All means as it is also shown for Alt Text, too.

@Storyline - can you not add a select box to choose the audio/video items you want in the Media Library and then export the captions for all of them at once? It seems like it would be a simple add-on since the Media Library already groups all the video and audio together and even identifies which ones have captions. They could export as a CSV or even plain text. Video would have to be done separately from audio - but still a lot better than the current option.

Media_Library.png

09/26/21 at 4:07 pm (UTC)

Hey Peter!

That's the problem I ran into a year ago. Storyline obfuscates the connection so, at least for me anyways, I could never make a direct correlation from the original file to the updated filename. It's tantalizingly close when you see it in the theme.xml.rels file but that's where it seems to break down. My hunch is that there's a lookup table created so Storyline can make the connection that we can't see - the table is probably being used *when* the course is published which is why we can't see the gyrations they're doing and hence can't make the correlation between what's published and the actual filename within the course.

If one could do what you're suggesting then the solution you're proposing wouldn't be difficult to create and there would be a lot of happy users. :)

Thor

09/26/21 at 8:42 pm (UTC)

Peter, this is amazing! Saved me a ton of time!

Thank you so much for sharing!

09/27/21 at 2:48 pm (UTC)

I could never make a direct correlation from the original file to the updated filename.

With the help of a couple utilities, I was able to match up the file names by comparing the duration of the original MP3s against those of the renamed files. Certainly not an elegant or foolproof solution, but it could be a timesaver in some cases. Either way, a couple of really useful tools.

NirSoft SysExporter: Copy details from Windows Explorer columns
https://www.nirsoft.net/utils/sysexp.html

Den4B ReNamer: Batch rename files and folders using rules and regular expressions.
https://www.den4b.com/products/renamer
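
The duration comparison itself can also be scripted once the durations are collected (for example, from the exported Explorer columns). The sketch below is only an illustration under that assumption: the function name, the tolerance value, and the sample file names are hypothetical, and clips of nearly identical length are reported as ambiguous, which is the "not foolproof" part.

```python
from collections import Counter


def match_by_duration(original_durations, generated_durations, tolerance=0.05):
    """Pair original file names with generated names of matching duration.

    Both arguments map file name -> duration in seconds. A pair is kept
    only when it is unique in both directions; everything else is
    returned in the `ambiguous` list for manual checking.
    """
    matches = {}
    ambiguous = []
    for original, duration in original_durations.items():
        candidates = [
            generated
            for generated, other in generated_durations.items()
            if abs(other - duration) <= tolerance
        ]
        if len(candidates) == 1:
            matches[original] = candidates[0]
        else:
            ambiguous.append(original)

    # Drop pairs where several originals claimed the same generated file.
    claimed = Counter(matches.values())
    for original, generated in list(matches.items()):
        if claimed[generated] > 1:
            del matches[original]
            ambiguous.append(original)
    return matches, ambiguous


if __name__ == "__main__":
    originals = {"intro.mp3": 12.40, "slide2.mp3": 30.10}
    generated = {"a1.mp3": 30.11, "b2.mp3": 12.41}
    print(match_by_duration(originals, generated))
```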

09/27/21 at 2:49 pm (UTC)

Glad it worked out!

Matthew Uhrich

11/10/21 at 5:47 pm (UTC)

This is one of the few things I miss about Captivate: the Word publish output included all the captions automatically, which made it easy to search my hundreds of output projects for mentions of a particular word/product. Right now the Word output from SL is pretty useless. Could we add captions to that output? How is that document generated?

11/15/21 at 1:52 pm (UTC)

Hello Matthew!

I appreciate you taking the time to share this feedback with us. I've added your comments to our feature request to reflect the impact of this addition. I'll be sure to share any updates with you in this discussion. I hope community members will share if they find a workaround to help you in the meantime!

James Bertelsen

02/11/22 at 9:28 pm (UTC)

I have built a tool that generates a report from a Storyline 360 file.

Each row of the report provides:
1. The media filename (generated by Storyline, and the original filename).
2. The captions filename (generated by Storyline, and the original filename).
3. The captions text (with timecodes and headers) and the plain text (without timecodes and headers).
4. The number of words in each captions file, and in total across the entire project.
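
For readers who want to see what items 3 and 4 involve, here is a rough Python sketch of stripping timecodes and headers from caption text and counting words. It is not the tool described in this post; the function names and the sample file name are invented for illustration, and it assumes well-formed WebVTT where blocks are separated by blank lines.

```python
import re

TAG = re.compile(r"<[^>]+>")  # inline cue markup such as <b> or <v Speaker>


def vtt_to_plain_text(vtt_text):
    """Return only the spoken text of a WebVTT document, as one string."""
    lines = []
    normalized = vtt_text.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalized):
        block_lines = block.split("\n")
        # Only blocks containing a timing line are cues; this skips the
        # WEBVTT header as well as NOTE and STYLE blocks.
        timing_index = next(
            (i for i, line in enumerate(block_lines) if "-->" in line), None
        )
        if timing_index is None:
            continue
        for line in block_lines[timing_index + 1:]:
            text = TAG.sub("", line).strip()
            if text:
                lines.append(text)
    return " ".join(lines)


def caption_report(captions):
    """Build per-file rows and a total word count.

    `captions` maps a caption file name to its VTT text.
    """
    rows = []
    for name, vtt_text in captions.items():
        plain = vtt_to_plain_text(vtt_text)
        rows.append({"file": name, "plain_text": plain, "words": len(plain.split())})
    total_words = sum(row["words"] for row in rows)
    return rows, total_words


if __name__ == "__main__":
    sample = (
        "WEBVTT\n\n"
        "1\n00:00:00.000 --> 00:00:02.000\nWelcome to the <b>course</b>.\n"
    )
    print(caption_report({"example.vtt": sample}))
```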

I am currently testing the tool, and have successfully generated reports for three .story files.

Can anyone share with me a Storyline 360 file with captions that I can test? I need more examples to ensure that it works consistently.

Thanks!

02/11/22 at 9:56 pm (UTC)

Jim, that's great. I've attached a lesson you can test. Let me know if you'd like me to break it up into several smaller files.

Lesson2percents.story

02/11/22 at 10:09 pm (UTC)

Peter, that worked perfectly. Thanks!

https://docs.google.com/spreadsheets/d/1_zKYer_OUvgS_lg8ZI8RALhnwBUq6SXlsd4sDnSGyBg/edit?usp=sharing

Report

Peter Fitzgerald

02/11/22 at 10:16 pm (UTC)

Very nice. This will be really useful. Any chance that it can work in reverse? E.g., bulk import VTTs to match them up with the audio file based on file name or duration?

Now that AWS Transcribe can output directly to VTT, it would save a lot of people a lot of time.

02/11/22 at 10:30 pm (UTC)

Yes. I'm working now on...

1. Upload your .story file.
2. Get the report and media files returned to you in a zip file.
3. Upload your translated media files and vtt files and get back the story file with all your media and captions in place.

When finished, this should provide a bulk caption export/import solution that makes translating media and captions in story files much easier.

People are welcome to email a link to a .story file while I'm still in the testing phase, if you'd like a report for your story file.

Thor Melicher

02/11/22 at 10:40 pm (UTC)

Jim,

That's looking very promising - nice work! For a future version, you might want to consider a 'local' option. Many won't want to send (or won't be able to send) via a service outside of their corporate network.

James Bertelsen

02/27/22 at 8:13 pm (UTC)

I have created an app that does the following:

Allows me (only, at this stage of development) to upload a Storyline 360 .story file, or select a previously uploaded .story file.
Generates a report table of all audio & video files in the .story file, including:
1. Names of original and converted media and caption files.
2. If the caption file already has text, display the text, provide a word count for that file, and the total number of words in all captions in the .story file.
3. Automatically generate speech-to-text captions files for each media file, individually or as an automated batch process for all media files.
Export:
1. The report as an Excel .xlsx file
2. The Storyline 360 .story file, with updated captions
3. The media and vtt files

I am considering adding:

Search and replace text across all captions in one or multiple .story files.
Machine translation of captions from one language to another for multiple languages
Text-to-speech for multiple languages

I am continuing to develop and test the app. If you have a .story file that you could share for testing, I may be willing to provide you with a report and, possibly, the updated Storyline 360 with auto-generated captions.

Any feedback or suggestions would also be appreciated.

(Video attached.)

Thanks,

Jim Bertelsen

jimbertelsen@gmail.com

storyline-caption-generator.mp4

Oskar Landowski

04/25/22 at 9:05 am (UTC)

I also want to signal the need to export captions in bulk: when localizing a large number of videos, having to export them one by one is unnecessarily tedious and time-consuming.

04/25/22 at 9:04 pm (UTC)

Adding closed caption export/import with the translation feature seems like a great opportunity!

Troy May

04/29/22 at 5:37 pm (UTC)

Create Text Transcript from Storyline Project

Alternative A

Export a video of the entire project. If you do not wish to wait a LONG TIME, you can export as a CD, then watch the output directory to see when SL3 has exported each audio track. Then join the audio together using a free app, change the MP3 to MP4, and...
Import the MP4 video into Microsoft Stream. Stream is part of every 365 subscription.
Wait for the video to process (2 hours for every hour of audio). Ensure it does so with its closed captions feature ON.
Export the .vtt it auto-generates from the audio on your video.
Use the Microsoft Stream transcript VTT file cleaner to get a big fat text file of your entire project.
Use your story project file to "clean" the text, separating into "slide" parts.

Alternative B

Wait for Articulate to add the following functionality to SL4.
- Add .vtt text assembler with new EXPORT ALL CAPTIONS button command on Audio OPTIONS tab.
- Vtt Assembler first considers all audio tracks which have .vtts attached.
- Vtt Assembler joins textual contents of each .vtt attachment.
- Vtt Assembler exports the joined data as a .txt file.

For all of us computer science wizards, it's that simple, guys. :)
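
Outside Storyline, the assembler steps in Alternative B can be approximated on a folder of already exported .vtt files. This is only a sketch: the function names and folder layout are hypothetical, and the files are joined in file-name order, which is an assumption and may not match slide order.

```python
from pathlib import Path


def cue_text(vtt_text):
    """Keep only cue payload lines, dropping headers, ids, and timecodes."""
    kept = []
    in_cue = False
    for line in vtt_text.splitlines():
        stripped = line.strip()
        if "-->" in stripped:
            in_cue = True   # timing line: the cue text starts on the next line
        elif not stripped:
            in_cue = False  # a blank line ends the current cue
        elif in_cue:
            kept.append(stripped)
    return " ".join(kept)


def assemble_transcript(vtt_folder, output_path):
    """Join the text of every .vtt file in a folder into one .txt file.

    Returns the number of caption files that were joined.
    """
    sections = []
    for vtt_file in sorted(Path(vtt_folder).glob("*.vtt")):
        # utf-8-sig also accepts files that start with a byte order mark.
        text = cue_text(vtt_file.read_text(encoding="utf-8-sig"))
        sections.append(f"[{vtt_file.name}]\n{text}")
    Path(output_path).write_text("\n\n".join(sections) + "\n", encoding="utf-8")
    return len(sections)
```

Calling `assemble_transcript("exported_captions", "transcript.txt")` (both names are placeholders) writes one labeled section per caption file.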

04/29/22 at 5:40 pm (UTC)

This I would love to see.

Valesa Clouse

Martin Abildgaard

09/01/22 at 12:46 pm (UTC)

Seriously, I can't believe such a simple bulk export/import feature does not exist!?!

We currently translate to 18 languages, and we have to manually export each cc-file for every piece of audio, have it translated, and manually upload each file for each language - it's excruciatingly time-consuming and should be an automated process!

Come on, Articulate, you can do better!

Tracy Hughes

09/11/22 at 8:56 pm (UTC)

This user is looking for the same ability as well...

James Bertelsen

12/09/22 at 4:48 pm (UTC)

Here's another solution that will export a transcript of all captions in a Storyline 360 file. A working example story file is attached.

https://www.youtube.com/watch?v=wL-s9fJALjQ

Questions: jimbertelsen@gmail.com

captions_test.story

04/18/24 at 1:30 pm (UTC)

This would work for video, but not for audio voiceover on every slide, though.

04/18/24 at 2:21 pm (UTC)

Things have changed a lot since this thread started - if you're interested in learning more, please reply to me privately here or reach out via LinkedIn.
