I'm confused by the claim of a 79% improvement for audio only over audio plus redundant on-screen text. It seems to refer to this line in the cited source, except that the line says basically the opposite: "In this situation, learners who received redundant on-screen text and spoken text generated an average of 79 percent more correct answers on a problem-solving test than learners who received only spoken text (Moreno & Mayer, 2002a)."
The situation they're describing with the 79% improvement is one where "the learner sees and hears a sentence, then views ten seconds of animation corresponding to it, then sees and hears the next sentence, then views ten seconds of corresponding animation, and so on" -- exactly the sort of text on screen being read situation this article claims is bad.
I don't want to seem too negative, even though it looks like that part of the research result was accidentally reversed. Another part of the result was that interspersed animation also greatly improved performance (as measured on some simple retention, transfer, and matching tests of knowledge), so the advice to use animation is still good. But putting text with redundant narration between short segments of animation produced even better performance than animation and narration alone.
They go on to elaborate that 79% isn't even the largest possible improvement: "Research shows that in certain situations learners generate approximately three times as many correct answers on a problem-solving transfer test from presentations containing concurrent spoken and printed text than from spoken text alone (Moreno & Mayer, 2002a)."
Both quotes are from the cited “E-learning and the Science of Instruction”; if you're curious about the research paper they rely on, you can find it at
http://tecfa.unige.ch/tecfa/teaching/methodo/MorenoMayer2002.pdf