Translating entire Storyline files with an alternative method

Aug 02, 2022

Dear Storyline 360 powerusers / Articulate support team,

[forgive my clumsy English]
My name is Manuel, and I work for a cat-tool editor/language service provider.
I am currently exploring new ways of having storyline files translated.

Quick reminder
To get a .story project file translated, there is only one 'official' way, which consists of exporting the content to be translated as an XLIFF file or a DOCX file, translating that file, and reimporting it into Storyline 360.

Unfortunately, with this method, some content is not exported into the translation file -- namely, the Text-To-Speech (TTS) elements. And anyone who has ever had to do it knows it is a tedious and time-consuming job to copy-paste each TTS into a text file, have it translated, then reinsert it and update the audio and subtitles.

As an LSP, we are always trying to optimize our translation processes and offer enhanced services to our customers. So, to get rid of that time-consuming TTS handling step, we decided to translate the translatable content directly from the slide[...].xml files compressed within a .story project file. To make sure we do not corrupt anything else in the slide[...].xml files, we have our CAT-tool (Computer-Aided Translation tool) protect all the non-translatable data. (There is more to it, but let me just skip the technical details here.)
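As a rough illustration of the first step described above, here is a minimal Python sketch that lists the slide XML files inside a .story archive. It assumes the .story file is a regular ZIP archive and that slides live under `story/slides/`, as the paths quoted later in this thread suggest; the function name `list_slide_xml` is made up for this example.

```python
import zipfile


def list_slide_xml(story_path):
    """Return the archive entries that look like slide XML files.

    Assumption: the .story file is a plain ZIP archive and slide files
    are stored under story/slides/ with an .xml extension.
    """
    with zipfile.ZipFile(story_path) as archive:
        return sorted(
            name
            for name in archive.namelist()
            if name.lower().startswith("story/slides/")
            and name.lower().endswith(".xml")
        )
```

Relationship files such as `slide.xml.rels` are skipped because they do not end in `.xml`.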

And... success! With this method, we were able to translate all the translatable content, including the TTS. We do not even need to set a new language + voice for each TTS, because during a 'post-translation' step, we automatically convert the 'original' TTS properties (language + gender + voice name/id) into 'target' TTS properties.

This is great, but there is still room for optimization. One thing that annoys us is that we have to "force" the recreation of the audio file + subtitle (.mp3 + .vtt within the .story file). To do so, after the translation and post-translation steps are complete, we currently open each TTS for editing, insert an extra space so that Storyline believes the content has changed (otherwise, the audio will not be updated) and click the "Update" button. Not an optimal procedure, but a lot of DTP editing work needs to be performed on a translated .story project file anyway.

Yet, up to several hours of work per translated file could be saved if it were possible to regenerate all the audio + subtitles at once, which, as far as I know, is not possible in Storyline 360. That is why we are now trying to improve our very special processing of .story project files -- an evolution of our process that will include automatic regeneration of the target language MP3 + VTT.

And again, we did it!
- a colleague of mine developed an app that rebuilds an audio stream from the target language + voice name + TTS content through the AWS "Polly" API (the very API Storyline 360 uses to generate the audio), then saves it under the right name as an MP3 file.
- the translatable content from the .vtt files is made available for translation in our CAT-tool.
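The colleague's app is not shown in this thread, so the following is only a sketch of what the Polly call might look like in Python. It assumes a client object created with the third-party `boto3` library (`boto3.client("polly")`) and valid AWS credentials; the function name, the sample text and the voice are illustrative choices, not details from the original post.

```python
def synthesize_to_mp3(polly_client, text, voice_id, out_path):
    """Ask Polly for an MP3 rendering of `text` and save it to `out_path`.

    `polly_client` is expected to behave like boto3.client("polly"):
    synthesize_speech() returns a dict whose "AudioStream" value can be read.
    Returns the number of bytes written.
    """
    response = polly_client.synthesize_speech(
        Text=text,
        VoiceId=voice_id,
        OutputFormat="mp3",
    )
    audio = response["AudioStream"].read()
    with open(out_path, "wb") as handle:
        handle.write(audio)
    return len(audio)


# Hypothetical usage (needs boto3 and AWS credentials):
#   import boto3
#   client = boto3.client("polly")
#   synthesize_to_mp3(client, "Bonjour tout le monde.", "Lea", "5qkiLe6ifXF.mp3")
```

Passing the client in as a parameter keeps the function easy to try out with a stand-in object before any real AWS call is made.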

Hurray!... Still, there is an annoying glitch (and that is the reason for this long message): although the "unconventionally" translated .story file no longer contains any source language TTS, the source audio of that same TTS can be heard upon publication and -- sometimes -- when performing a preview.

- To work around the issue, I tried to clear the Storyline cache files:
+ c:\Users\<username>\AppData\Roaming\Articulate\Storyline
+ c:\Users\<username>\AppData\Local\Temp\Articulate\Storyline\
+ c:\Users\mbe\AppData\Local\Articulate\360\ApiCache\
It did not suffice: the "phantom" source audio kept reappearing upon publication.
I just do not understand how. Either the old source audio is somehow hiding in the .story file (but I thoroughly searched for any trace of the original audio or the original text and found not the slightest hint of anything), or there is another cache somewhere (I searched hard but could not find it).
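For anyone who wants to repeat the text search described above, here is a minimal Python sketch. It assumes the .story file is a ZIP archive and looks for a phrase encoded as UTF-8 or UTF-16 LE in every entry; the function name is made up, and a search like this cannot detect audio content, only leftover text.

```python
import zipfile


def find_text_in_story(story_path, needle):
    """Return the names of archive entries containing `needle`.

    The phrase is searched as raw bytes in two common encodings, so it
    will miss text stored in any other form.
    """
    patterns = [needle.encode(enc) for enc in ("utf-8", "utf-16-le")]
    hits = []
    with zipfile.ZipFile(story_path) as archive:
        for name in archive.namelist():
            data = archive.read(name)
            if any(pattern in data for pattern in patterns):
                hits.append(name)
    return hits
```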

- For those who might wonder: I looked into the temp folder used for building the publication (e.g. c:\Users\<username>\AppData\Local\Temp\Articulate\Storyline\5oPaU2GP1u5\story_content\5qkiLe6ifXF.mp3 and 5qkiLe6ifXF.wav and temp1.wav \Preview\story_content\5qkiLe6ifXF_44100_112_0.mp3). There, the audio file contained the target language audio. But under the publication folder (e.g. d:\...\test Storyline output\story_content\5qkiLe6ifXF_44100_112_1.mp3), the audio file contains the source language audio!
- The only way I could work around this "phantom cache issue" was to publish the translated .story project file from a PC with a fresh Storyline 360 installation on which the source language .story project had never been opened. But this is not an acceptable workaround, because when we deliver the translated .story file to the customer, we can be certain they will have an issue upon publishing.
- This issue disappears if we open each TTS for editing, add an extra space so that Storyline believes the content has changed (otherwise, the audio will not be updated) and click the "Update" button. But that is not a solution: this is just what we currently do.

So please let me know if you have any idea at all about how I can stop that "phantom" source audio from reappearing. Or if you know about a mysterious cache I missed.


By the way, I am using Storyline 360 V3.66.28270.0

Thanks if you read this far!
PS: I attached two files -- a proof-of-concept -- to illustrate what I described:
- test_diapo_simple_1tts_01.story >> original storyline file. Open that file and publish it.
- test_diapo_simple_1tts_01trad.story >> translated storyline file. Open that file and publish it: the audio will remain in English though it's been translated into French and though the MP3 is up to date.

24 Replies
Jose Tansengco

Hi Manuel, 

Thanks for sharing a detailed description of how you are improving translating text in Storyline 360. I went ahead and opened a support case for you so our support team can address your inquiry regarding the audio file. Your case is in good hands, and a member of our support team will be in touch with you shortly! 

Jürgen Schoenemeyer
the audio will remain in English though it's been translated into French and though the MP3 is up to date.

that happened to me also: I automatically replaced German audios with translated audios - the preview was perfect, but publishing was not - every time, the original German mp3s were published.

It was not possible to delete this global cloud (???) cache

unfortunately there is no documentation on how the internal IDs are generated. That's why I kept the original IDs for the translated mp3s - that was probably the problem. IDs cannot be reused.

so I had to manually click the update audio button 8 (languages) x 250 (audios) times (with about 20-30 hard crashes* of storyline)

Jürgen

* the support ticket had no result

PS: I use Python 3 for the storyline parser

 

Thor Melicher

Like Manuel and Jürgen, I've developed companion apps for Storyline and Rise 360 (due to forum rules, I can't mention these by name as it could be considered self-promotion).  I tested my latest app using Manuel's files (thank you for posting these!) and I think I have a better understanding now of what is happening.

Here are the steps I took to arrive at the hypothesis below:

  • I took Manuel's first Storyline file (English audio) and replaced the audio with my app.  I then published the course and the replacement audio worked as expected (it was not the original audio but the replacement audio I chose).
  • I then opened Manuel's second file, noted the French replaced audio, made no changes and published immediately.  The audio that played was the audio I replaced in the first file - not Manuel's English file or French file that appears on the timeline.
  • I went ahead and replaced the audio with my app on Manuel's second file with a copy of the French audio that I named differently.  Again, the audio that played was the audio I replaced in the first file.
  • And then to close out testing, I went back to the first file of Manuel's, replaced the original file's audio with my app again and... same result.  The very first audio file I used played and not the new audio. 

Here's my hypothesis about what might be happening in this specific situation.  Please jump in and let me know where I'm wrong in my assumptions:

  • The Storyline application uses an internal settings file.  It is not a cache per se but rather a table that stores a unique ID for both external assets (audio, image, video) and internal assets (textbox, button, etc.) added to the application as a whole and not a specific .story file.  One cannot simply delete the cache to 'cure this problem' - this internal file also stores application settings, etc. so you likely wouldn't want to delete it even if you could.
  • The unique ID is likely a checksum of the file from my simple testing.
  • Storyline seems to fetch a new ID only when the replace media function is called by the user.  With the new ID added to the internal settings file, the course publishes as expected (the replaced audio is what you expect to see in the published version of the course)
  • Items with this specific ID seem to be 'timeline bound' - it has to be placed on the timeline to get an ID.  Captions do not live on the timeline but rather are added to another object so doing a replacement with my app has no problem.
  • The ID is only 'checksum unique' and not truly a new ID created each and every time an asset is added.  If that were the case, my experience above should not be repeatable (and perhaps it isn't but with the testing I did, it seems so - Manuel had the perfect example because the second file is a direct copy and not a new Storyline course.)
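To make the 'checksum unique' idea above concrete, here is a small Python sketch. It only shows that a file checksum depends on the bytes and not on the file name or location, so a copy of a file gets the same value while a changed file gets a new one. MD5 is assumed purely for illustration; nothing in this hypothesis says which algorithm (if any) Storyline uses.

```python
import hashlib


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two files with identical content but different names return the same digest, which is the behaviour the hypothesis relies on.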

I think part of the solution to the problem would be to refresh the internal settings file.  I can't think of a way to do this in how Storyline is programmed today.  It would have to be an internal programming change.  The developers wouldn't want to do the refresh often though because there would likely be a significant performance hit as every asset (external and internal) has an assigned ID.

I hope this all makes sense and leads to further discussion.

Thor

Thor Melicher

Now, with a little more testing, let's talk about a possible work around.  It's not perfect but more on that in just a moment.

  1. Go to Publish Settings
  2. Change any of the audio settings (Audio bitrate or Optimize Audio Volume)
  3. Click Ok
  4. Click Publish

By making a simple change, Storyline will re-encode the audio and generate the replaced audio as expected.

Here's why it's not perfect - go back to Publish Settings and change your audio back to exactly as it was and then publish again.  Surprised?  The previous audio file plays.  Change it back one more time to exactly what you had and publish again.  Once again, the expected audio plays.

So...  I mentioned earlier about the internal file with settings, but I also think there's an internal cache (stored locally) as suggested.  This internal cache isn't in sync because "it only knows what it knows" and it's using that file checksum ID to find what's supposed to be there.  When changing the audio, it forces a new file ID, and then publishes the audio as expected.  

FYI - When publishing, I noticed a temporary file (temp1.wav) is created here (C:\Users\username\AppData\Local\Temp\Articulate\Storyline\{foldername})

but it wasn't the file I heard after publishing.  It was, however, the file I wanted to hear.  More to discover, right?

Thor

Jürgen Schoenemeyer

I had a look at my code from April

that's what I have done to convert audios from one language to another

prepare:
 - folder 1 with all mp3 (original language)
 - folder 2 with all mp3 (translated language) - same file name like in folder 1

step 1: for all audios in the media library (story/theme/theme.xml)
 - replace all audios paths with the corresponding audios from folder 2
 - update all audio properties
   (md5 <stream + source>, useCnt, bytes, originBytes, date, ...)

step 2: for all audios in the media library files (story/media/.... .mp3)
 - replace all audios with the audios from folder 2 - files renamed with the original internal filenames

=> i re-use the old IDs and internal filenames from the original language audios for the translated audios
=> i suspect that is exactly the problem
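Step 2 above could be sketched roughly as follows in Python. This is not the actual parser mentioned in this thread, only a minimal illustration: it assumes the .story file is a ZIP archive with media under `story/media/`, and that folder 2 holds MP3s already renamed to the internal file names. It does not touch theme.xml (step 1), and whether Storyline accepts a re-zipped archive is not guaranteed. Note that reusing the old IDs and file names in this way is exactly what is suspected here of triggering the stale audio.

```python
import os
import zipfile


def replace_media(src_story, dst_story, replacements_dir,
                  media_prefix="story/media/"):
    """Copy a .story archive, swapping in MP3s found in `replacements_dir`.

    An entry is replaced when it sits under `media_prefix`, ends in .mp3
    and a file with the same base name exists in `replacements_dir`.
    Returns the list of replaced entry names.
    """
    replaced = []
    with zipfile.ZipFile(src_story) as src, \
            zipfile.ZipFile(dst_story, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            candidate = os.path.join(
                replacements_dir, os.path.basename(info.filename)
            )
            if (info.filename.startswith(media_prefix)
                    and info.filename.lower().endswith(".mp3")
                    and os.path.isfile(candidate)):
                with open(candidate, "rb") as handle:
                    data = handle.read()
                replaced.append(info.filename)
            dst.writestr(info.filename, data)
    return replaced
```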

unfortunately I could not find out how to generate the new internal file name for a newly generated ID

but maybe there is someone who knows (@articulate ???)

Jürgen

Manuel BERRI
Thor Melicher

[...]

  • The Storyline application uses an internal settings file.  It is not a cache per se but rather a table that stores a unique ID for both external assets (audio, image, video) and internal assets (textbox, button, etc.) added to the application as a whole and not a specific .story file.  One cannot simply delete the cache to 'cure this problem' - this internal file also stores application settings, etc. so you likely wouldn't want to delete it even if you could.

[...]

So...  I mentioned earlier about the internal file with settings, but I also think there's an internal cache (stored locally) as suggested. 

Hi Thor,

Thanks for your insightful input. That is really helpful and I feel less alone! 
Although this internal file with settings cannot be deleted, perhaps the internal cache -- wherever it is -- can be deleted. 
If we can locate (and understand ?) both these files, we may find a suitable workaround to our problems.

Manuel

Thor Melicher

Great dialogue everyone!  As we're all looking from the outside in, I think we're getting to the same realization - the holdup for the approach we're taking (all sound very similar) to function reliably is the inability to convince Storyline to either refresh its internal cache, generate a new ID to match the replaced file, or a combination of both.

We know the functionality is within Storyline as it occurs in the Media library or simply right clicking the file and choosing replace audio.  However, this is manual and isn't a good solution when you have a course of any significant length.

Manuel, please keep us up to date on what you hear back from Articulate, that is if you're able to share :).  I think your two files illustrate the problem well and with our sharing of what we're seeing may lead to a viable solution for everyone.

Thor

Manuel BERRI
Thor Melicher

Manuel, please keep us up to date on what you hear back from Articulate, that is if you're able to share :).  I think your two files illustrate the problem well and with our sharing of what we're seeing may lead to a viable solution for everyone.

Sure. I am away until the end of the month, but I will definitely try to keep you posted if I am provided with some useful information from  Articulate support.

Manuel BERRI

Hi everyone,

Finally back to work, and back to this annoying issue.
I received a standard answer from Articulate support:
"[...]it appears that your process of translating your courses isn't using the standard procedure of exporting the XLIFF file or Word file, and you're modifying the content of the published output. Articulate software and its published output are supported as they are. I'm sorry, but modifying the published output isn't something we are equipped to handle, and we don't offer advice on it so that it will work in a specific environment.[...]"

This was not a surprise, really. But still, this is a disappointment.


So I guess we have to find our own way to solve this. Any idea where this "internal cache" is located, anyone?

Math Notermans

Quite an interesting discussion. Although I do not use it for translation, at times I have dived into Storyline 360's dark inner secrets. When you unzip a .story file and open up the files inside, you can find all the IDs and change them. I used that to dynamically create default starting templates from Google Sheets as a base setup. 

When changing things, chances are your .story files won't work anymore after you zip them back up and change the extension back to .story, but it might give a clue as to how to solve your issues.

Jürgen Schoenemeyer

I am searching for the "mysterious" not (???) local cache

with a normal local publish to web with storyline 360 (C:\Users\...) "articulate 360 Desktop Services" communicates with 2 amazon aws services
 
example
 - ec2-52-45-64-241.compute-1.amazon.aws.com (Ashburn, Virginia, United States)
 - ec2-44-196-72-199.compute.1.amazon.aws.com (Ashburn, Virginia, United States)

@articulate: why is my (European) data copied to the US amazon cloud with a local publish ???



Manuel BERRI

Hi Jürgen,

But  Storyline uses AWS anyway when generating the audio from the TTS content (Amazon Polly is the name of the API)...

Still I am wondering: is this mysterious cache that we cannot locate sitting on a remote Amazon server? It could be the answer.  To test this hypothesis, we could take the local computer offline and publish again a .story file that has already been published and has not been changed. That way we could check whether any external communication, and therefore any external cache, is needed.

I cannot do this at the moment, but feel free to go ahead before me if you can.

Jürgen Schoenemeyer

if this turns out to be true and not just a short-term routing error - storyline will unfortunately no longer be usable in europe

articulate has to stop copying (user ???) data to the cloud, when publishing NOT to the cloud

I see no problems with review 360, because i do not need to use the server - for review I use our own company server

Jose Tansengco

Hi Jürgen, 

Happy to clarify some of the things that you raised! 

Storyline 360 does use Amazon Polly as its TTS service, which means it will attempt to connect to the service whenever TTS is used in a course. This is most likely the connection to the endpoints that appeared in your monitoring. One way to test this is to publish a course that does not use TTS, and check if you will still see any connections to AWS. 

Articulate 360 will never publish customer data to our servers without our users' knowledge, and you can check out all relevant documentation pertaining to data security here.

If you still have any concerns about customer data, please feel free to open a ticket with our support team here so we can investigate. 

Let me know if you have any questions!

Jürgen Schoenemeyer

>Articulate 360 will never publish customer data to our servers without our users's knowledge,  ...

which data are copied to amazon cloud (us) when I publish to local drive ?

are data from our customers
 - text
 - audios
 - images
 - videos
copied ?

if yes, where can i turn this off ?

if no, everything is ok

Jürgen

Jürgen Schoenemeyer

I don't use TTS and storyline connects to Amazon on publish local*

- ec2-52-45-64-241.compute-1.amazon.aws.com (Ashburn, Virginia, United States)
- ec2-44-196-72-199.compute.1.amazon.aws.com (Ashburn, Virginia, United States)

are these TTS instances?

Jürgen

* test was an empty .story file with one rectangle and three mp3 included (no compress)

   with the second publish (without changes) no connections to aws (local cache ?)

   third publish with one mp3 changed one connection to another aws instance

Manuel BERRI

Hey there,
Still trying to figure out what is going on.
I performed a few tests recently.
1) OFFLINE TESTS:
a) I opened Storyline 360, then went offline. You need an internet connection to be able to open the Text-To-Speech dialog in Storyline 360 (you get an error message when attempting to do so offline). It means that a connection to one or several web services is established and then, once you have typed in your text and clicked on "Update", the audio + VTT are created on the fly. 
b) I also tried to publish offline to see whether the "cache" issue relates to an internal cache or an external one (see my message dated 09/05/22 at 1:19 pm (UTC)). The project was "successfully" published, but the audio was not updated [sigh].

Conclusion: these tests tend to show there must be some sort of internal cache after all. But where the hell is it?

2) MISCELLANEOUS TESTS
a) I tried to modify the GUID used in TTS\@AssetG within slide.xml and in props\@g within story.xml and theme.xml, then I tried to publish again to see if, by any chance, the audio would be updated. Alas, it is not, and even worse, the audio + subtitle completely disappear...
b) I tried to delete the MP3 from the storyline archive (under story\media\) to see if this could "force" a rebuild of it. But when you do so, the Storyline file can no longer be opened.
c) I tried to rename the MP3 and replace the old name with the new one in \story\story.xml, \story\slides\slide.xml, \story\slides\_rels\slide.xml.rels and \story\theme\theme.xml. The storyline file can be opened, but the audio file neither loads nor plays. Upon publishing the project, the audio is not recreated.

Conclusion: the MP3 file name and TTS\@AssetG are not just random names. They are somehow involved in the audio regeneration at some point (a very vague conclusion indeed).
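For anyone who wants to inspect these GUIDs before changing them, here is a minimal Python sketch that collects every value of a given attribute from an XML document. The attribute name `AssetG` comes from the tests above; the sample XML in the usage note is invented, since the real layout of slide.xml is not documented in this thread, and real files may use XML namespaces, which would change the tag names returned.

```python
import xml.etree.ElementTree as ET


def collect_attribute(xml_text, attribute):
    """Return (tag, value) pairs for every element carrying `attribute`."""
    root = ET.fromstring(xml_text)
    return [
        (element.tag, element.get(attribute))
        for element in root.iter()
        if element.get(attribute) is not None
    ]
```

With an invented input such as `<slide><TTS AssetG="1111"/><shape/></slide>`, calling `collect_attribute(xml_text, "AssetG")` returns one pair for the `TTS` element. Running the same function with `"g"` on story.xml and theme.xml would make it easier to cross-check that every file still refers to the same value.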

Voilà, that is all for today.

Would be glad to hear about your findings, if any.

Kind regards,