Forum Discussion
Week 1 Discussion
1. What surprised you about how AI handled text or media generation?
I was quite pleased with the AI image generation for the character "Saanvi": the AI feature instinctively created an East Indian-looking female character based on the name alone. I had expected to redo the prompt with specific instructions on the character's ethnicity once the images were generated, but it turned out I didn't need to refine the prompt at all.
2. What challenges did you face when trying to get AI to produce the results you wanted?
I've run into challenges several times when using the AI TTS feature. Some iterations have the "voice" pronounce words correctly, but then, if I need to change the phrasing or add more context, the generated voice mispronounces a word that it previously pronounced correctly! Also, with acronyms, the voice sometimes pronounces the acronym as a word (which is what I want) and other times reads out each individual letter. I came across one instance (in my own project) where I needed the AI-generated voice to read "MIPS" as a word. I tried several different methods to have it pronounced as "mips", but it just would not comply! Are there any hints or tips for fine-tuning pronunciations? Is there a way to copy/paste phonetic spellings into the text box to guide the pronunciation?
However, the closed caption (CC) generation capabilities are fantastic! So much time is saved by not having to import a separate .srt, .vtt, or similar file to generate captions!
I agree about the generated image of Saanvi. I was curious to see if it would infer how someone looks based on a name. That's another thing users of this tool will need to check for as they build courses, whether they like the inferences or not.