27 June 2008

Future Near: Universal Speech Recognition

In a discussion on Enda Guinan's blog what began as a conversation regarding trying to explain to people what we [those of us in AT-related jobs] actually do when we go to work transferred over to a conversation involving speech recognition, which is one of the things which provides the wow factor when we demonstrate. From there, we got into the question of universal speech recognition, as in the question we are always asked, "is there a way that I can listen to the professor and have that converted into text." The answer is, "no, not really." But there is another answer, "the future is almost here." Very soon now we might be able to start saying yes.

Actually, we have been able to convert what the prof, or lecturer, or teacher is saying into text. It has just been difficult. The Liberated Learning Consortium has been doing this for a decade, and five years ago at an American community college I outfitted a few deaf students with laptops equipped with ViaVoice speech recognition software, and their instructors with wireless microphones linked to receivers on those laptops. We got the instructors to train their voices on ViaVoice, and then whatever they said in class arrived in a Word doc on the students' laptops. The accuracy was great, but the words came unpunctuated, which drove half of the students crazy (this is part of what the complex Liberated Learning system has tried to solve). And anything any other student said was, of course, lost. And... yes, getting the faculty to participate was not easy.

The world, however, is changing. The first paragraph of this post was dictated through jott.com. I have "fixed" it, but I have shown you where I fixed it. Green means that jott added a (?) and got the word wrong. Purple means that jott added a (?) and got the word right. Red is punctuation which I had to add. It isn't perfect - it never will be. Enda's name came out "___ duh _______" which is not correct. And yet, it is mostly correct, and the punctuation is there.

So now you can see speech recognition accuracy without voice training. Now you know where we will be very soon.

This is important. It means that we are perhaps only a year or two away from truly being able to have almost everything said in a classroom transcribed and available to those with hearing, attention, and learning issues and differences. That will make everything different for a whole range of kids - but let me focus on how this will change education for everyone.

When I have taught online courses, two differences have appeared. First, online teaching is really hard - you can never "wing it" - everything has to be prepared, and it is much more work to monitor online discussions than real-life ones. But second, you have this extraordinary record of what was said and who said it, what was discussed, what was asked, what was misunderstood, what was very difficult. It is all there, and not just fragmented in memory. You can go back and say, "wow! that didn't work," or you can say, "look at this, I really need to mediate this better." Perhaps more importantly, students can go back and say, "did I hear that right?" "did she say what I thought she said?" "could I have said that better?"

One promise of universal speech recognition is the ability to bring one of the best features of online learning into face-to-face learning. And bringing that in will enable a teaching and studying revolution.

It is close. Very close. Try jott.com today. Get a bunch of your friends to try it. And then start imagining what you could do with this kind of power in your classroom.

* jott.com is North America only for the moment. SpinVox is available in the UK and Ireland, but it is not inexpensive.

- Ira Socol

Brian S. Friedlander, Ph.D said...

Hi Ira:

You are right on. I have been showing Jott to others, and it is incredible to see their reactions. I am personally using Jott to post to my Blogger account and to a WordPress account that I have, as well as for posting to Twitter. It is great to see that Jott is a mainstream application. I know that it makes our heads spin to think of all the applications going forward for this technology. Keep up your thought-provoking work; it is greatly appreciated.

Brian S. Friedlander

Lon said...

Hi Ira,
I too love Jott and have been using it to post to my Google calendar. I haven't quite known whether to trust it to make my posts for me because of errors. I like how you demonstrated the use of it in your blog. Thanks for your post.

Whatley said...

Not sure if you know, but SpinVox is also available in NAM.

Drop me a note if you'd like to try it out, happy to assist.


James Whatley

Jim Dornberg said...

Jott relies on people in India to transcribe your messages; it isn't true computerized "speech-to-text." And, unfortunately, Jott is now out of beta, and I don't know to what extent it will be free anymore.