Forum Discussion

Community Member

4 years ago

Translating entire Storyline files with an alternative method

Dear Storyline 360 powerusers / Articulate support team,

[forgive my clumsy English]
My name is Manuel, and I work for a cat-tool editor/language service provider.
I am currently exploring new ways of having storyline files translated.

Quick reminder
To get a .story project file translated, there is only one 'official' way, which consists of exporting the content to be translated as an XLIFF file or a DOCX file, translating that file, and reimporting it into Storyline 360.

Unfortunately, with this method, some content is not exported into the translation file -- namely, the Text-To-Speech (TTS) elements. And anyone who has ever had to do it knows it is a tedious and time-consuming job to copy-paste each TTS into a text file, have it translated, then reinsert it and update the audio and subtitles.

As an LSP, we are always trying to optimize our translation processes and propose enhanced services to our customers. So, to get rid of that time-consuming TTS handling step, we decided to translate the translatable content directly from the slide[...].xml files compressed within a .storyline project file. To make sure we do not corrupt anything else in the slide[...].xml files, we have our CAT-tool (Computer-Aided Translation tool) protect all the non-translatable data. (There is more to it, but let me just skip the technical details here.)

And... success! With this method, we were able to translate all the translatable content, including the TTS. We do not even need to set a new language + voice for each TTS, because during a 'post-translation' step, we automatically convert the 'original' TTS properties (language + gender + voice name/id) into 'target' TTS properties.
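
For illustration only, the 'post-translation' conversion could look roughly like the sketch below. The property names (`language`, `gender`, `voice`) and the mapping table are hypothetical and are not taken from the slide XML, whose structure is not described here; the voice names are simply examples of Amazon Polly voices.

```python
# Hypothetical mapping from (target language, gender) to a target voice name.
# Keys and property names are assumptions made for this illustration only.
TARGET_VOICES = {
    ("fr-FR", "Female"): "Lea",
    ("fr-FR", "Male"): "Mathieu",
}


def convert_tts_properties(props, target_language):
    """Return a copy of props with language and voice switched to the target."""
    voice = TARGET_VOICES[(target_language, props["gender"])]
    # The gender is kept; only language and voice are replaced.
    return {**props, "language": target_language, "voice": voice}


source = {"language": "en-US", "gender": "Female", "voice": "Joanna"}
target = convert_tts_properties(source, "fr-FR")
```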

This is great, but there is still room for optimization. One thing that annoys us is that we have to "force" the recreation of the audio file + subtitle (.mp3 + .vtt within the .story file). To do so, after the translation and post-translation steps are complete, we currently open the TTS for editing, insert an extra space so that Storyline believes the content has changed (otherwise, the content will not be updated) and click on the "Update" button. Not an optimal way of working, but a lot of DTP editing work needs to be performed on a translated .storyline project file anyway.

Yet, up to several hours of work per translated file could be saved if it were possible to regenerate all the audio + subtitles at once, which, as far as I know, is not possible in Storyline 360. That is why we are now trying to improve our very special processing of .storyline project files -- an evolution of our process that will include automatic regeneration of the target language MP3 + VTT.

And again, we did it!
- a colleague of mine developed an app that rebuilds an audio stream using the target language + voice name + TTS content through the AWS "Polly" API (this very API is used by Storyline 360 to generate the audio), then saves it under the right name as an MP3 file.
- the translatable content from the .vtt files is made available for translation in our CAT-tool.
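
The two steps above could be sketched roughly as follows. This is not our actual app, only a minimal illustration: it assumes the third-party `boto3` SDK and valid AWS credentials for the Polly call, and the function names are made up for this example. The VTT helper is a deliberately simple parser that only collects the text lines following each cue timing line.

```python
def build_polly_request(text, voice_id, language_code):
    """Build the keyword arguments for Polly's synthesize_speech call."""
    return {
        "Text": text,
        "OutputFormat": "mp3",
        "VoiceId": voice_id,
        "LanguageCode": language_code,
    }


def synthesize_to_mp3(text, voice_id, language_code, out_path):
    """Call Amazon Polly and save the returned audio stream as an MP3 file."""
    import boto3  # third-party AWS SDK, imported lazily; needs AWS credentials

    client = boto3.client("polly")
    response = client.synthesize_speech(
        **build_polly_request(text, voice_id, language_code)
    )
    with open(out_path, "wb") as handle:
        handle.write(response["AudioStream"].read())


def vtt_cue_texts(vtt):
    """Return the text of each cue in a WebVTT document, in order."""
    texts = []
    for block in vtt.replace("\r\n", "\n").strip().split("\n\n"):
        lines = block.split("\n")
        for index, line in enumerate(lines):
            if "-->" in line:  # timing line; the cue text follows it
                text = "\n".join(lines[index + 1:]).strip()
                if text:
                    texts.append(text)
                break
    return texts
```

For example, `synthesize_to_mp3("Bonjour", "Lea", "fr-FR", "out.mp3")` would write the French audio to `out.mp3` (the output name here is arbitrary, not the internal Storyline file name).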

Hurray!... Still, there is an annoying glitch (and that is the reason for this long message): although the content of the "unconventionally" translated .storyline file no longer contains source language TTS, the source audio content of the same TTS can be heard upon publication and -- sometimes -- when performing a preview.

- To work around the issue, I tried to clear the Storyline cache files:
+ c:\Users\<username>\AppData\Roaming\Articulate\Storyline
+ c:\Users\<username>\AppData\Local\Temp\Articulate\Storyline\
+ c:\Users\mbe\AppData\Local\Articulate\360\ApiCache\
It did not suffice: the "phantom" source audio kept reappearing upon publication.
I thought an old version of the source audio might remain in the .story project file, so I searched high and low but could not find anything suspicious.
I just do not understand how: it either means the old source audio is somehow hiding in the .story file (but I thoroughly searched for any trace of the original audio or the original text -- not the slightest hint of anything), or it means there is another cache, somewhere, somehow (I searched hard but could not find it).
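
For anyone who wants to repeat the cache clearing described above, here is a minimal sketch. It assumes Storyline is closed and that emptying these three folders is acceptable on your machine; the function name is made up for this example, and as noted above, clearing them did not fix the issue for me.

```python
import shutil
from pathlib import Path

# Folders named above, relative to the Windows user profile directory.
CACHE_SUBDIRS = [
    Path("AppData/Roaming/Articulate/Storyline"),
    Path("AppData/Local/Temp/Articulate/Storyline"),
    Path("AppData/Local/Articulate/360/ApiCache"),
]


def clear_storyline_caches(profile_dir):
    """Empty each cache folder that exists; return the folders that were emptied."""
    cleared = []
    for sub in CACHE_SUBDIRS:
        folder = Path(profile_dir) / sub
        if not folder.is_dir():
            continue  # folder not present for this profile
        for child in folder.iterdir():
            if child.is_dir() and not child.is_symlink():
                shutil.rmtree(child)
            else:
                child.unlink()
        cleared.append(folder)
    return cleared
```

Typical use would be `clear_storyline_caches(Path.home())`.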

- For those who might wonder: I looked into the temp folder used for building the publication (e.g. c:\Users\<username>\AppData\Local\Temp\Articulate\Storyline\5oPaU2GP1u5\story_content\5qkiLe6ifXF.mp3 and 5qkiLe6ifXF.wav and temp1.wav \Preview\story_content\5qkiLe6ifXF_44100_112_0.mp3). There, the audio file contained the target language audio. But under the publication folder (e.g. d:\...\test Storyline output\story_content\5qkiLe6ifXF_44100_112_1.mp3), the audio file contains the source language audio!
- The only way I could work around this "phantom cache issue" was to publish the translated .story project file from a PC with a fresh Storyline 360 installation and on which the source language .storyline project was never opened. But this is not an acceptable workaround, because when we deliver the translated .story file to the customer, we can be certain they will have an issue upon publishing.
- This issue disappears if we open each TTS for editing, add an extra space so that Storyline believes the content has changed (otherwise, the content will not be updated) and click on the "Update" button. But that is not a solution: this is just what we currently do.
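
One way to check objectively whether a published MP3 is byte-for-byte the same file as the one published from the source language project is to compare checksums. The sketch below is only an illustration with made-up function names. Note the limitation: identical hashes prove the stale file was reused, but different hashes prove nothing about the language, since publishing may re-encode the audio.

```python
import hashlib


def file_md5(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True when both files have exactly the same bytes (by MD5)."""
    return file_md5(path_a) == file_md5(path_b)
```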

So please let me know if you have any idea at all about how I can stop that "phantom" source audio from reappearing. Or if you know about a mysterious cache I missed.

By the way, I am using Storyline 360 V3.66.28270.0

Thanks if you went that far and read my message!
PS: I attached two files -- a proof-of-concept -- to illustrate what I described:
- test_diapo_simple_1tts_01.story >> original storyline file. Open that file and publish it.
- test_diapo_simple_1tts_01trad.story >> translated storyline file. Open that file and publish it: the audio will remain in English though it's been translated into French and though the MP3 is up to date.

test_diapo_simple_1tts_01trad.story (5.6 MB)

test_diapo_simple_1tts_01.story (5.6 MB)

storyline 360

24 Replies

JoseTansengco
Staff
4 years ago
Hi Manuel,

Thanks for sharing a detailed description of how you are improving translating text in Storyline 360. I went ahead and opened a support case for you so our support team can address your inquiry regarding the audio file. Your case is in good hands, and a member of our support team will be in touch with you shortly!
Jürgen_Schoene_
Community Member
4 years ago
>the audio will remain in English though it's been translated into French and though the MP3 is up to date.

that happened to me too. I automatically replaced german audios with translated audios - the preview was perfect, but publishing was not - every time, the original german mp3s were published.

It was not possible to delete this global cloud (???) cache

unfortunately there is no documentation on how the internal IDs are generated. That's why I kept the original IDs for the translated mp3s - that was probably the problem. IDs cannot be reused.

so I had to manually click 8 (languages) x 250 (audios) the update audio button (with about 20-30 hard crashes* of storyline)

Jürgen

* the support ticket had no result

PS: I use Python 3 for the storyline parser
- ThorMelicher-b5
  Community Member
  4 years ago
  Like Manuel and Jürgen, I've developed companion apps for Storyline and Rise 360 (due to forum rules, I can't mention these by name as it could be considered self-promotion). I tested my latest app using Manuel's files (thank you for posting these!) and I think I have a better understanding now of what is happening.
  
  Here are the steps I took to get to my below hypothesis:
  
  I took Manuel's first Storyline file (English audio) and replaced the audio with my app. I then published the course and the replacement audio worked as expected (it was not the original audio but the replacement audio I chose).
  
  I then opened Manuel's second file, noted the French replaced audio, made no changes and published immediately. The audio that played was the audio I replaced in the first file - not Manuel's English file or French file that appears on the timeline.
  
  I went ahead and replaced the audio with my app on Manuel's second file with a copy of the French audio that I named differently. Again, the audio that played was the audio I replaced in the first file.
  
  And then to close out testing, I went back to the first file of Manuel's, replaced the original file's audio with my app again and... same result. The very first audio file I used played and not the new audio.
  
  Here's my hypothesis about what might be happening in this specific situation. Please jump in and let me know where I'm wrong in my assumptions:
  
  The Storyline application uses an internal settings file. It is not a cache per se but rather a table that stores a unique ID for both external assets (audio, image, video) and internal assets (textbox, button, etc.) added to the application as a whole and not a specific .story file. One cannot simply delete the cache to 'cure this problem' - this internal file also stores application settings, etc. so you likely wouldn't want to delete it even if you could.
  
  The unique ID is likely a checksum of the file from my simple testing.
  
  Storyline seems to fetch a new ID only when the replace media function is called by the user. With the new ID added to the internal settings file, the course publishes as expected (the replaced audio is what you expect to see in the published version of the course)
  
  Items with this specific ID seem to be 'timeline bound' - it has to be placed on the timeline to get an ID. Captions do not live on the timeline but rather are added to another object so doing a replacement with my app has no problem.
  
  The ID is only 'checksum unique' and not truly a new ID created each and every time an asset is added. If that were the case, my experience above should not be repeatable (and perhaps it isn't but with the testing I did, it seems so - Manuel had the perfect example because the second file is a direct copy and not a new Storyline course.)
  
  I think part of the solution to the problem would be to refresh the internal settings file. I can't think of a way to do this in how Storyline is programmed today. It would have to be an internal programming change. The developers wouldn't want to do the refresh often though because there would likely be a significant performance hit as every asset (external and internal) has an assigned ID.
  
  I hope this all makes sense and leads to further discussion.
  
  Thor
  - ThorMelicher-b5
    Community Member
    4 years ago
    Now, with a little more testing, let's talk about a possible work around. It's not perfect but more on that in just a moment.
    
    Go to Publish Settings
    
    Change any of the audio settings (Audio bitrate or Optimize Audio Volume)
    
    Click Ok
    
    Click Publish
    
    By making a simple change, Storyline will re-encode the audio and generate the replaced audio as expected.
    
    Here's why it's not perfect - go back to Publish Settings and change your audio back to exactly as it was and then publish again. Surprised? The previous audio file plays. Change it back one more time to exactly what you had and publish again. Once again, the expected audio plays.
    
    So... I mentioned earlier about the internal file with settings, but I also think there's an internal cache (stored locally) as suggested. This internal cache isn't in sync because "it only knows what it knows" and it's using that file checksum ID to find what's supposed to be there. When changing the audio, it forces a new file ID, and then publishes the audio as expected.
    
    FYI - When publishing, I noticed a temporary file (temp1.wav) is created here (C:\Users\username\AppData\Local\Temp\Articulate\Storyline\{foldername})
    
    but it wasn't the file I heard after publishing. It was, however, the file I wanted to hear. More to discover, right?
    
    Thor
Jürgen_Schoene_
Community Member
4 years ago
deleting all caches would only be a hotfix for the problem - I am looking for a solution that does not require this, because our customers want to modify the storyline file themselves and then publish it in a 'normal' environment.

Jürgen
Jürgen_Schoene_
Community Member
4 years ago
I had a look at my code from April

that's what I have done to convert audios from one language to another

prepare:
- folder 1 with all mp3 (original language)
- folder 2 with all mp3 (translated language) - same file name like in folder 1

step 1: for all audios in the media library (story/theme/theme.xml)
- replace all audios paths with the corresponding audios from folder 2
- update all audio properties
(md5 <stream + source>, useCnt, bytes, originBytes, date, ...)

step2: for all audios in the media library files (story/media/.... .mp3)
- replace all audios with the audios from folder 2 - files renamed with the original internal filenames

=> i re-use the old IDs and internal filenames from the original language audios for the translated audios
=> i suspect that is exactly the problem

unfortunately I could not find out how to generate the new internal file name for a new generated ID

but maybe there is someone who knows (@articulate ???)
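
A minimal Python 3 sketch of step 2 might look like the block below. It is not my actual parser: it assumes the .story file can be read as a ZIP archive with entries under story/media/, the function name and the `replacements` mapping (internal entry name -> translated mp3 on disk) are made up for this example, and it leaves out step 1 (updating the properties in story/theme/theme.xml). It also re-uses the old internal file names, which is exactly what I suspect is the problem.

```python
import zipfile
from pathlib import Path


def replace_media(src_story, dst_story, replacements):
    """Copy src_story to dst_story, swapping the bytes of selected entries.

    replacements maps an internal entry name (e.g. "story/media/xyz.mp3")
    to the path of the translated audio file that should take its place.
    Returns the list of entry names that were replaced.
    """
    replaced = []
    with zipfile.ZipFile(src_story) as zin, \
            zipfile.ZipFile(dst_story, "w", zipfile.ZIP_DEFLATED) as zout:
        for info in zin.infolist():
            if info.filename in replacements:
                data = Path(replacements[info.filename]).read_bytes()
                replaced.append(info.filename)
            else:
                data = zin.read(info.filename)  # unchanged entry
            # Passing the ZipInfo keeps the original name and metadata.
            zout.writestr(info, data)
    return replaced
```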

Jürgen
ThorMelicher-b5
Community Member
4 years ago
Great dialogue everyone! As we're all looking from the outside in, I think we're getting to the same realization - the holdup for the approach we're taking (all sound very similar) to function reliably is the inability to convince Storyline to either refresh its internal cache, generate a new ID to match the replaced file, or a combination of both.

We know the functionality is within Storyline as it occurs in the Media library or simply right clicking the file and choosing replace audio. However, this is manual and isn't a good solution when you have a course of any significant length.

Manuel, please keep us up to date on what you hear back from Articulate, that is if you're able to share :). I think your two files illustrate the problem well and with our sharing of what we're seeing may lead to a viable solution for everyone.

Thor
- ManuelBERRI
  Community Member
  4 years ago
  
  Thor Melicher
  
  Manuel, please keep us up to date on what you hear back from Articulate, that is if you're able to share :). I think your two files illustrate the problem well and with our sharing of what we're seeing may lead to a viable solution for everyone.
  
  Sure. I am away until the end of the month, but I will definitely try to keep you posted if I am provided with some useful information from Articulate support.
  - ManuelBERRI
    Community Member
    4 years ago
    
    Hi everyone,
    
    Finally back to work, and back to this annoying issue.
    I received a standard answer from Articulate support:
    "[...]it appears that your process of translating your courses isn't using the standard procedure of exporting the XLIFF file or Word file, and you're modifying the content of the published output. Articulate software and its published output are supported as they are. I'm sorry, but modifying the published output isn't something we are equipped to handle, and we don't offer advice on it so that it will work in a specific environment.[...]"
    
    This was not a surprise, really. But still, this is a disappointment.
    
    So I guess we have to find our own way of solving this. Any idea where this "internal cache" is located, anyone?
Jürgen_Schoene_
Community Member
4 years ago
all tests are made with (Microsoft) Process Monitor on Windows 10

TCP tracing - here is a tutorial

the data are stored in Frankfurt/Main (Germany) with testing Amazon Polly (Storyline Text to Speech) -> that is ok
JoseTansengco
Staff
4 years ago
Hi Jürgen,

Happy to clarify some of the things that you raised!

Storyline 360 does use Amazon Polly as its TTS service, which means it will attempt to connect to the service whenever TTS is used in a course. This is most likely the connection to the endpoints that appeared in your monitoring. One way to test this is to publish a course that does not use TTS, and check if you will still see any connections to AWS.

Articulate 360 will never publish customer data to our servers without our users' knowledge, and you can check out all relevant documentation pertaining to data security here.

If you still have any concerns about customer data, please feel free to open a ticket with our support team here so we can investigate.

Let me know if you have any questions!
Jürgen_Schoene_
Community Member
4 years ago
>Articulate 360 will never publish customer data to our servers without our users' knowledge, ...

which data are copied to amazon cloud (us) when I publish to local drive ?

are data from our customers
- text
- audios
- images
- videos
copied ?

if yes, where can i turn this off ?

if no, everything is ok

Jürgen
JoseTansengco
Staff
4 years ago
Hi Jürgen,

No data is copied to our servers when you publish locally. Storyline 360 connects to Amazon Polly for any TTS related actions, so you'll occasionally see the application connecting to AWS endpoints if you are using the TTS service in your course.
Jürgen_Schoene_
Community Member
4 years ago
I don't use TTS and storyline connects to Amazon on publish local*

- ec2-52-45-64-241.compute-1.amazon.aws.com (Ashburn, Virginia, United States)
- ec2-44-196-72-199.compute.1.amazon.aws.com (Ashburn, Virginia, United States)

are these TTS instances?

Jürgen

* test was an empty .story file with one rectangle and three mp3 included (no compress)

with the second publish (without changes) no connections to aws (local cache ?)

third publish with one mp3 changed one connection to another aws instance
- MathNotermans-9
  Community Member
  4 years ago
  And if you remove the mp3s...might be somewhere in the libraries used for the audio ?