Last Update: Aug 19

Speech Recognition Status Report

May 2003

Editor: I've been singing the praises of Speech Recognition for a couple of years, ever since I used it in place of real-time captioning to teach a computer class for people with hearing loss. That particular application was trained to my voice, and I occasionally had to repeat myself to get the software to "understand" what I was saying. But it was very workable. The newer version that I got a few months ago is noticeably better. And I expect the next version will be significantly better still.

At the other end of the Speech Recognition spectrum from the simple program I use are the ones that interact with thousands of customers without training on their specific voices. Those programs currently only work in situations where the number of possible responses is very limited, but it may not be too long before they are able to work in more general circumstances.

Michael Phillips is the Chief Technology Officer of SpeechWorks, a company that produces these commercial systems. He was recently interviewed by MIT's "Technology Review" (TR). Here are excerpts from that interview. The complete text is available at http://www.technologyreview.com/articles/focuson0603_speech.asp?p=0

~~~~~~~~~~~~~~~~~~~

Michael Phillips is Chief Technology Officer for SpeechWorks. He spoke with Technology Review Senior Editor Wade Roush about his company's interactive voice-response technology, which automates the handling of customer calls at companies like United Airlines and Federal Express. With a father's pride, Phillips introduced Tom, a jaunty voice with an American accent who is one of SpeechWorks' synthesized-speech "personas." (Tom's colleagues Helen and Karen sound like real women from Britain and Australia, respectively, and personas speaking in many other languages and accents are available.)

Phillips, who co-founded SpeechWorks in 1994 to commercialize language-processing software he had helped to build at MIT's Laboratory for Computer Science, talked about the company's plans for making such speech-driven interfaces the dominant way we interact with computers.

[snip]

TR: Is the technology really getting that good that fast?

PHILLIPS: The technology is improving rapidly. We're sort of in the Moore's Law of speech recognition. We cut error rates in the speech recognizer by 20 or 30 percent every year.

TR: What would you say the rough error rate was when you started in 1994, and what is it now?

PHILLIPS: It depends on the task. And as the speech recognizer gets better and better, we do more and larger and more complex tasks. So a better measure than just what is the accuracy on a fixed task, is what kinds of tasks you can get acceptable accuracy on. When we first started, it was basically a few hundred-word vocabulary. Things like getting a phone number from a user, or even getting a city name were possible, but stretching it. Since then we've deployed stock trading and stock quote systems that have 50,000- to 100,000-word vocabularies. Most of the applications we have exploited are not constrained by the quality of the speech recognition so much as by the user interface. We are doing very sophisticated things like entering any street address in the country, entering any name you have, even something like getting an e-mail address from somebody over the telephone.
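
Editor's note: Phillips does not give absolute error rates, but his figure of a 20 or 30 percent cut every year compounds quickly. The short Python sketch below is only an illustration of that arithmetic. It assumes the cut is relative to the previous year's error rate, that it applies to a single fixed task (which Phillips says is not the best measure), and that it has held for the nine years from 1994 to 2003. The function name and the nine-year span are my own choices, not anything from the interview.

```python
def remaining_error_fraction(annual_reduction: float, years: int) -> float:
    """Fraction of the starting error rate left after compounding yearly cuts.

    Assumes each year's cut is relative to the previous year's error rate,
    so the result is (1 - annual_reduction) ** years.
    """
    if not 0.0 <= annual_reduction < 1.0:
        raise ValueError("annual_reduction must be in [0, 1)")
    if years < 0:
        raise ValueError("years must be non-negative")
    return (1.0 - annual_reduction) ** years


if __name__ == "__main__":
    # Nine years (1994 to 2003) is an assumed span for illustration only.
    for rate in (0.20, 0.30):
        left = remaining_error_fraction(rate, 9)
        print(f"{rate:.0%} cut per year for 9 years leaves "
              f"{left:.1%} of the original errors")
```

Under those assumptions, a recognizer would be making somewhere between roughly one twenty-fifth and one seventh as many errors on the same task as it did at the start, which fits Phillips's point that the tasks themselves have grown much larger.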

[snip]