18 February 2009

A Primer - Text-To-Speech

A raucous debate on a national (US) listserv regarding the use of Text-To-Speech software included much argument about whether digital reading was actually "reading." That's for a post coming soon. What surprised me was how little the literacy educators who argue that this software is either ineffective or bad actually know about the systems they oppose.

So I posted a "starter list," which I'll re-post - with a few additions - right here.

Let me try to present, in relatively short form, how some of these reading technologies operate, from the basics of Click-Speak to the sophisticated literacy and study support of WYNN.

I don't want to try to offer a definitive list. Rather, here are a few systems that I tend to use the most, including - hopefully - a small sense of why this or that has worked for the people - youth or adult - I have worked with.

I emphasize that I use multiple technologies myself, and that most of those I have worked with have chosen to do the same. This is a big part of what I describe as "Toolbelt Theory" - a learner-centered, Task-based, specifically ordered reworking of Joy Zabala's "SETT" Assistive Technology choice protocol.
A Toolbelt for a Lifetime
Toolbelt Theory for Everyone
"Toolbelt Theory" is based around the idea that as humans we are tool users, and that we choose tools most effectively when that choice begins with the Task at hand, and then considers the Environment in which that task must be performed, the Skill set of the individual (the tool chooser), and the Tools which are available (thus the acronym TEST).

So, we have different literacy solutions based on the variety of challenges we face, just as we have different screwdrivers and different saws.

With that said, here are some of those tools:


Click-Speak is a free Firefox add-in created by Charles L. Chen (who also created FireVox for the visually impaired). Click-Speak can read any text which appears in your browser, including, of course, anything in on-line software such as Google Docs or any email. Click-Speak has its own "toolbar" (which I do not use) which puts three square "buttons" in the upper left corner of your browser screen - White for "Read Selected Text" - Green for "Automatically Read the Whole Page" - Red for "Stop Reading." It is that simple. You can also, as I do, put those three buttons into your Firefox Bookmark Bar, cleaning up the look of your window. Or you can access these functions by right clicking anywhere on the page or on any selected text.

"Click Speak Options" - in your Firefox tool menu - allows you to choose among five different speed settings and five different voice pitches. It uses "robotic sounding" synthesized speech. Click-Speak does not highlight individual words as they are read, but does highlight sentences when reading automatically and does scroll the page.

I use Click-Speak for short web reading: newspaper articles, emails, information I'm looking up. I also use it extensively with email and Google Docs to check text I have typed or dictated, since I am much more likely to catch missing words or badly structured sentences if I listen to my writing rather than try to re-read it. I really never mind the voice, though "actor read" audiobooks have spoiled me, and I'm unlikely to try to read much 'great literature' - say, books from the Literature Network - this way.

In schools, as I've mentioned in earlier messages here, I've seen this used in many ways. Some students simply highlight unfamiliar words and right click to "read selected text" so they can hear it - significantly improving their chances of recognizing that word the next time. Other students have used it for reading whole pages of digital social studies texts, etc. We have used it to support the independent editing of student writing with Google Docs and seen dramatic results.

Among adults, I personally know about 50 people who use this as their primary reading tool, at work and at home, including things like bills now digitally sent to them. It is highly effective in employment situations because it is perfect for short readings - such as instructions, supervisor messages, time schedules, etc. It can easily be added to the free AccessApps USB stick system, and can thus be carried with you.

Microsoft Reader

This is a Windows-only system (many free solutions are; it is much, much harder to build third party accessibility solutions for Apple O/S than for Microsoft or Linux), and it is not widely promoted, but I love it, and so do many of those who have tried it. Microsoft Reader gives you a 'book-looking' page on your computer, reads with word-by-word highlighting, and allows you to place bookmarks, to highlight text, and to take notes (linked to specific places in the text, which thus allows its use as a testing system). You can even draw pictures, and, if using it on a tablet PC, draw on the page itself (a feature that we're starting to see has interesting impacts on ASD users and some dyslexics).

MS Reader has a somewhat clunky install. You have to download the basic program - for Desktop or Laptop - or - for Tablet PC - then download the Text-to-Speech (TTS) software - then the Dictionaries (the Encarta if you want right-click word definition) - and finally, right now for Word2000 or Word2003 only, the "RMR" tool, which puts a toolbar icon in your Microsoft Word that allows instant conversion of any text into the E-book format. Microsoft folks assure me that a Word2007 version is on the way.

MS Reader puts your books in the "MyLibrary" folder in your Windows "My Documents" folder. There are thousands of books already available in this format - all the classics are at the University of Virginia - but with Word2003 it is incredibly easy to put any text (with or without pictures) into this format, and even very young kids seem to love it (they can read over and over again even if mom or dad isn't available).

It does have, again, the "robot voice," but you can control speed, the pitch, type size, make the page full-screen, have it speak all the controls (or not).

I use this for books. I use it for some academic papers. Many teachers use it for accessible testing (because it is free), and I use it for assessment at Voc/Rehab. In employment we've put employee handbooks, etc, into this form. We've put training manuals into this form. We've even put social service agency intake instructions into this form.

It is not very "flexible" - but it is a wonderful accessible system.

Adobe Reader 8 or 9

The speech function ("Read Out Loud") in Adobe Reader 8 or 9 has been great for me and others faced with the flood of Acrobat Docs. Even if we do not use it for reading vast amounts of text we use it to see if the article is worth the time, or to grab chunks of text in classes or meetings. To use Read Out Loud, open your document and go to the View Menu. Go down to "Read Out Loud" then choose "Activate Read Out Loud." (This takes time on long documents, so it is something you want to do in advance.) Then go back to -View- Read Out Loud - and choose "read entire page" or "read to end of document" (from your cursor location), then "stop" when you are done. Free.


Dial2Do allows non-readers to send text messages. For free. Sign up at http://www.dial2do.com/ and then call their number and speak your text and tell them where to send it. An essential employment tool for those who struggle with print, also for anyone who drives. (see SpinVox in the UK, also available with most US cellphone systems)


AbbyMe is the reverse of Dial2Do: it allows "you" (or a boss) to text a voice message to a non-reader. For free. Sign up, or just use the service, at http://www.abbyme.com/ - you type the message, AbbyMe calls the person and speaks it.

Mobile Phone Camera Text-To-Speech
Yes, Nokia and Kurzweil have an expensive and wonderful joint effort, a phone which takes a picture of a document and converts it to text which is then read aloud. It is great stuff but it also costs something north of $1,500 (US). There are, however, three far cheaper solutions:

Take a picture of a document or whiteboard with your 2mp or better camera phone, send it to ScanR, and for $3 or $5 a month, get accessible text sent back to your phone or email.

Or do the same with Qipit.

Or, for the more tech savvy, add TopOCR to your phone and do it yourself.


Freedom Scientific's WYNN is a full-featured literacy and study skills solution which (finally) made university possible for me (on try number 4) and which gets me through grad school. Because it completely changed my life when I found it early in 1997 I can't sound "neutral" about it, but it is a system I have used with four year olds and a system which has allowed - just in my personal experiences - hundreds of students to succeed in college and hundreds more in advanced post-secondary career training.

WYNN is not free. There are great deals on school network or multi-user purchases, but the retail price for the scanning version is almost $1,000. I refuse to say this is expensive. $1,000 is very little for, say, one student I worked with who has an "ink-on-paper" reading level of <1.0, but who also now holds a master's degree in history.

WYNN allows you to bring any text into it. You can import digital text or just paste it into a blank document. Or you can scan materials in from your scanner, or "virtually scan" in documents such as old, non-accessible, Acrobat docs. When you scan something in (real scan or virtual) you get a magical choice - you can see "text only" or you can see the page in "exact view" - exactly as it looks, which does great things for textbooks, for books with tables, diagrams, etc.

In either view, WYNN has "two-level highlighting": you see each word highlighted as it is read and you see the sentence, line, or paragraph highlighted as well. In either view you can right click for definitions, word spelling, or to insert notes by typing or speaking. No software anywhere does a better job of reading diagrams than WYNN, which makes it a fabulous STEM solution and, in my experience, has made it the perfect solution for vocational training.
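
For anyone curious about what "two-level highlighting" means mechanically, here is a minimal Python sketch. It is not WYNN's code and does not use any WYNN interface; the function name `highlight_spans` and the naive rule for where a sentence ends are my own assumptions, chosen only to illustrate the idea of an inner (word) highlight nested inside an outer (sentence) highlight.

```python
import re

def highlight_spans(text, word_index):
    """Return ((sentence_start, sentence_end), (word_start, word_end))
    as character offsets for the word currently being spoken.
    Hypothetical helper for illustration only."""
    words = list(re.finditer(r"\S+", text))
    if not 0 <= word_index < len(words):
        raise IndexError("word_index out of range")
    word = words[word_index]

    # Naive sentence rule (an assumption): a sentence is a run of
    # characters ending in '.', '!' or '?', or at the end of the text.
    for sentence in re.finditer(r"[^.!?]+[.!?]*", text):
        if sentence.start() <= word.start() < sentence.end():
            chunk = sentence.group()
            # Skip leading whitespace so the outer highlight hugs the sentence.
            start = sentence.start() + len(chunk) - len(chunk.lstrip())
            return (start, sentence.end()), (word.start(), word.end())

    # Fallback: treat the whole text as one sentence.
    return (0, len(text)), (word.start(), word.end())
```

A reading program would call something like this each time the speech engine moves to the next word, then repaint the sentence span in one color and the word span in another.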

You can edit the text within documents (say, if scanning has made an error), but instructors can lock out that feature (or use the specific Test-Talker version). You can also write in WYNN with full word processing features including predictive spelling if you choose. It has bookmarks, highlighting, etc.

WYNN is incredibly flexible. Everything about the text - size, font, color, background colors, space between letters, words, lines, is hugely variable. Speed of speech, pitch, choices of much more sophisticated voices, are all massively variable. You can alter word pronunciation within a document (to, for example, speak names accurately).

WYNN also has a full Internet-Explorer-based web browser which gives you all the same choices.

The one "knock" on WYNN has been that its interface does not look "very adult." It is icon-based. (You can choose all kinds of degrees of verbosity - whether instructions and button names, or even typing, are spoken by the software.) But I have found this to be an advantage. WYNN is a lifespan solution, simple enough (especially if you build a simplified "custom" toolbar) for pre-schoolers who don't have reading parents at home, and completely at home in universities, because of all the supports.

The WYNN voice is still - obviously - synthesized, but the sound is much more natural than that in either of the above. It does take some significant "play time" for users to make their choices among all the settings, but one can "set" those, and any computer (or network) can hold onto an almost unlimited group of user settings. I find that I have actually created more than one user profile for myself. I have "easier reading" settings, and "harder reading" settings for those arcane journal articles I read so often. The "harder" settings put fewer words on a line, and read a bit slower.
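
To make the idea of multiple user profiles concrete, here is a small Python sketch. The class name `ReadingProfile`, its fields, and every number in it are hypothetical; they are not WYNN's actual setting names or my real values, just a way of showing that a "harder reading" profile trades speed for fewer words per line.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ReadingProfile:
    name: str
    words_per_line: int   # hypothetical field, not a WYNN setting name
    speech_rate_wpm: int  # hypothetical field, words per minute

def wrap_words(text, words_per_line):
    """Break text into lines holding at most words_per_line words each."""
    words = text.split()
    return [" ".join(words[i:i + words_per_line])
            for i in range(0, len(words), words_per_line)]

# Illustrative numbers only, chosen to show the contrast.
EASIER = ReadingProfile("easier reading", words_per_line=12, speech_rate_wpm=200)
HARDER = ReadingProfile("harder reading", words_per_line=6, speech_rate_wpm=160)
```

Calling `wrap_words(article_text, HARDER.words_per_line)` would lay out a dense journal article in shorter lines, while the same text under `EASIER` would fill the screen more fully.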

Of course there are many other choices, from Read-and-Write-Gold to Natural Reader, from the Reading Pen to the free WordTalk and DSpeech, but I hope this suggests a few places to start.

- Ira Socol

1 comment:

James Miranda said...

Hello Ira,
I am a grad assistant at Western Michigan University, and we are starting a project for the provost in the hopes of updating our school's teaching models to more appropriately serve incoming students. I would love to speak with you and ask a few questions to get me started in the right direction.