There is absolutely nothing wrong with either of your examples. Whether to use more slides with fewer objects each, or fewer slides with many objects each, is purely a matter of your (the designer's) personal preference. But there are some things you can consider. If you are going to move from slide to slide during the narration, you can use a trigger to jump to slide 1.2 when the audio ends. That sort of seamless flow won't irritate the learner with an interruption, or leave them confused about what to do next. I would keep it flowing, and only break when I want the learner to interact with the slide or the objects on it.
If you want to have that many objects moving and changing (and that sort of activity has been shown to strongly aid learning, whereas showing the same text as the audio detracts from it), then you are going to need a lot of objects and a lot of changes. That's one of the distinctions between good design and great design: relevant, simple illustrations that complement the content.
If you want a little more organization of your objects and animations, you might consider layers. On each layer, put a collection of related objects along with their associated triggers. Show the layer at the appropriate time, and time the animations against that layer's own timeline.
Ultimately, though, the ability to live with, keep track of, and organize large numbers of objects and actions is why they pay IDs the big bucks. Your only options are to cope with that, or produce "read this and click next" courses, and this example shows me which method you prefer. So dig in, and do it well.