Converting speech to text

I am wondering if anyone has experience with taking an audio file and converting it to text. Currently, I play the file, hit pause, type what I heard, and repeat. This is very time-consuming.

I have seen software called Dragon Speaking (or something similar) and wonder whether it really works, or whether it is a huge pain to train it to the voice being heard.

I need to do this conversion because I do not know how to create captions that will play on the videos used in my courses. For every audio file or video, we have to provide a text script.

I look forward to hearing all of your suggestions.


8 Replies
Steve Flowers

Speech recognition software is pretty accurate. You'll still need to do some editing, but it's remarkable what you can do with it, even on mobile devices.

Dragon NaturallySpeaking is an industry leader in this area and offers a mature platform, but it comes at a relatively steep price. If you're doing a lot of transcription, it may be worth it. If not, you might be able to get by with a free option. These tools are typically driven from a microphone input, but I'd imagine you can configure them to pick up audio from any source.

If you have Windows 7, here are the instructions for setting up and using the built-in speech recognition in Windows 7. I believe this functionality is also available in other versions of Windows. I can't testify to its ease of use or accuracy, but I'd imagine that it's worth the price.

Adam Schwartz

There's a very cool service called SpeakerText that's doing this for video.

I think they'll take an ordinary audio file too if you want to give it a whirl. The advantage over Dragon or similar is that the accuracy would be very high.

Disclosure: I'm a recent investor in SpeakerText.

Brian Allen

I don't know if you're a Camtasia Studio user, but one of the features that caught my eye in the newest version of the software is speech-to-text captioning.

I've got the newest version but haven't had a chance to play with this to see how good it is. I saw it in action at DevLearn last year, and it looked pretty cool. I thought it might be something I could use to easily create text captioning for all of my audio and video, even if I'm not using Camtasia in my Articulate presentation...
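
On the captioning side of the original question: once a transcript has rough start and end times for each sentence, a caption file is just plain text. Below is a minimal sketch in Python that writes the common SubRip (.srt) layout. It assumes your video player accepts .srt files, which I haven't checked for any of the tools mentioned in this thread. The function names and the sample sentences are made up for illustration and don't come from any particular product.

```python
def format_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def build_srt(segments):
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        # Each cue: a running number, a time range, then the caption text.
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # A blank line separates one cue from the next.
    return "\n".join(blocks)


# Hypothetical sample data: the times and sentences are invented.
segments = [
    (0.0, 2.5, "Welcome to the course."),
    (2.5, 6.0, "Today we cover the basics."),
]

if __name__ == "__main__":
    print(build_srt(segments))
```

You would save the printed text as a .srt file next to the video. The timings still have to come from somewhere, whether a speech recognition tool or your own notes while listening.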