Technology for the deaf to go mainstream
By Peter Abrahams
Editor: Voice recognition (VR) technology has made significant progress
in the past few years - so much so that it is becoming a mainstream
captioning solution. IBM's ViaVoice has long been one of the premier VR
solutions, and it looks like ViaScribe may be their next commercial VR
technology.
Peter Abrahams has long been involved with accessibility technology and
has written over 200 articles on a variety of topics. He is currently a
Practice Leader with Bloor Research. For more information on Peter please
point your browser to: http://www.it-analysis.com/about/author.php?id=47
Our thanks to Peter and Bloor Research for permission to share this
article.
~~~~~~~~~~~~~~~~~~~
Technology that was initially designed to help deaf university students
has a potentially much wider use.
How does a university student who is profoundly deaf or just hard of
hearing cope with lectures where the spoken word is the main, if not the
only, communication mechanism? Until recently, only with the help of some
sub-optimal solutions:
- Reading the lips of the lecturer is not really an option as the student
needs to be very close and the lecturer has always got to be facing them.
- Having an interpreter attend all the lectures and either sign or
lip-speak is expensive, is difficult to organise as there are a limited
number of interpreters, and it makes the student very dependent on a third
party. This is only practical for major lectures or conferences where many
deaf people may be attending.
- Using a stenographer to type the lecture in real time has the same
limitations as an interpreter, but does have the added benefit of
providing a permanent written transcript that can be used by the deaf
student, as well as others, later.
- Having the lecturer provide notes before the lecture is not natural for
the lecturer nor will it reflect the dynamic nature of a live lecture.
None of these solutions are practical for an average hearing-impaired
student. In 1999 some universities and IBM set up the Liberated Learning
Consortium to see how technology could improve the situation. IBM had a
speech recognition product called ViaVoice and the consortium was set up to
see if it could be used in the lecture environment.
Some initial trials showed that the technology had potential but there
were some major issues the first being that the lecturer was not adding in
any punctuation ViaVoice was designed for dictation and the person dictating
would add in commands like comma full stop new paragraph so you landed up
with text that looks like this paragraph which is very difficult to read in
real time even worse it would pick up commands such as save and close and
close down the application in the middle of the lecture.
So keeping the same speech recognition engine...
Some modifications were made...
Firstly to stop the system recognising and acting on commands...
Secondly to recognise that lecturers speak in small chunks with pauses in
between...
And laying out these chunks on separate lines with ellipses at the end...
Made the text much easier to read...
Various options were tried and tests are still going on but this format
seems to work well...
As we hope you can see from this example.
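The pause-based layout described above can be sketched in a few lines of Python. This is only an illustration, not ViaScribe's actual implementation: the input format (each word with a start and end time in seconds), the function name chunk_by_pauses and the 0.6 second pause threshold are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

# (word, start_seconds, end_seconds): a hypothetical recogniser output format.
TimedWord = Tuple[str, float, float]


def chunk_by_pauses(words: List[TimedWord], min_pause: float = 0.6) -> List[str]:
    """Group recognised words into display lines, breaking at each pause.

    Every word is treated as plain text. Words such as "save" or "close"
    are never interpreted as commands; they are simply displayed.
    """
    lines: List[str] = []
    current: List[str] = []
    previous_end: Optional[float] = None

    def flush() -> None:
        # Capitalise the first letter and mark the pause with an ellipsis.
        text = " ".join(current)
        lines.append(text[:1].upper() + text[1:] + "...")

    for word, start, end in words:
        # A silence of at least min_pause seconds closes the current chunk.
        if current and previous_end is not None and start - previous_end >= min_pause:
            flush()
            current = []
        current.append(word)
        previous_end = end

    if current:
        flush()
    return lines


if __name__ == "__main__":
    sample = [
        ("so", 0.0, 0.2), ("keeping", 0.25, 0.6), ("the", 0.65, 0.7),
        ("engine", 0.75, 1.1), ("some", 2.0, 2.2),
        ("modifications", 2.25, 2.9), ("save", 4.0, 4.3),
    ]
    for line in chunk_by_pauses(sample):
        print(line)
```

Note that the word "save" in the sample is displayed like any other word, which mirrors the first modification: the lecture transcript must never trigger an application command.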
The technology proved itself in the real-time environment but it was not
really practical because it was difficult for the lecturer to do the initial
training and set up.
For the solution to really work the recognition rate has to be in the
high 90s. This is difficult to achieve and can only happen if the
speech recognition engine can learn the lecturer's voice. Lecturers are busy
people and would be willing to spend up to a couple of hours training the
system, but with the specialist language this was not normally enough. The
solution was to record the audio of the lecture, let the engine do its best
and then have an editor correct its mistakes. After a few live lectures the
recognition rate dramatically improves into the 90s.
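The article does not say how the recognition rate is measured. One common approach, assumed here purely for illustration, is word accuracy: compare the engine's output with the editor's corrected transcript and count the word-level edits needed to turn one into the other. The function names below are hypothetical.

```python
from typing import List


def word_edit_distance(reference: List[str], hypothesis: List[str]) -> int:
    """Minimum number of word insertions, deletions and substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        current = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # word missing from the engine output
                current[j - 1] + 1,      # extra word in the engine output
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def recognition_rate(corrected: str, recognised: str) -> float:
    """Word accuracy of the engine output against the corrected transcript."""
    reference = corrected.lower().split()
    hypothesis = recognised.lower().split()
    if not reference:
        raise ValueError("corrected transcript is empty")
    errors = word_edit_distance(reference, hypothesis)
    # Clamp at zero: many inserted words could otherwise give a negative rate.
    return max(0.0, 1.0 - errors / len(reference))


if __name__ == "__main__":
    corrected = "the engine learns the voice of each lecturer over time"
    recognised = "the engine learns the choice of each lecturer over time"
    print(f"{recognition_rate(corrected, recognised):.0%}")
```

With a measure like this, the editor's corrections after each lecture double as the reference against which improvement can be tracked.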
The set-up of the equipment at the beginning of the lecture was automated,
removing an unnecessary burden on the lecturer.
This technology means that a transcript of the lecture can be provided in
real-time and can be displayed on a screen in the lecture hall. It is also
possible to transmit it to portable devices, such as a PDA, to individual
students. The deaf student is now able to follow the lecture, just like any
other student, so with this technology the student is no longer disabled.
In a sense the student is now more able than an unimpaired student. How
often, when you are listening to a lecture, have you wanted an instant
replay of the last few sentences, either because you did not fully
understand first time, or because you were momentarily distracted and did
not hear everything? The display shows the last few chunks, so a
student with the technology is more able.
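A display that keeps only the last few chunks on screen can be modelled as a fixed-length queue. The sketch below is an assumption about how such a display might work, not a description of the real product; the class name RollingDisplay and the default of five visible chunks are invented for the example.

```python
from collections import deque


class RollingDisplay:
    """Keeps only the most recent chunks, like a caption screen."""

    def __init__(self, visible_chunks: int = 5) -> None:
        # A deque with maxlen silently drops the oldest chunk when full.
        self._chunks = deque(maxlen=visible_chunks)

    def add(self, chunk: str) -> None:
        self._chunks.append(chunk)

    def render(self) -> str:
        return "\n".join(self._chunks)


if __name__ == "__main__":
    display = RollingDisplay(visible_chunks=2)
    for chunk in ["So keeping the engine...", "Some modifications...", "Were made..."]:
        display.add(chunk)
    print(display.render())
```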
Even more important than the instant replay is the fact that at the end
of the lecture there is a complete transcript available on-demand. This is
not just a boon for the hearing impaired students but for all students.
However good students are at note taking, and many are not, the ability to
go back to the lecture and read the transcript of any section will improve
the learning experience.
The technology has been developed much further by enabling the transcript
to be synchronised with audio or video recordings and with any PowerPoint
presentation material. Over the web and on-demand a student can then see the
transcript alongside the appropriate audio, video or slide. This is a
complete solution for the hearing impaired student and also provides
completely new opportunities to other students such as:
- 'Attending' a lecture that they could not attend live.
- Distance learning.
- Foreign students often find it easier to read than listen; the
combination of the audio with the transcript is a powerful learning tool.
- Students with learning difficulties can benefit from the ability to
replay and to choose the media.
- All students benefit from the ability to have a search engine find
relevant sections of the transcripts.
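Synchronisation and search both become straightforward once each chunk of the transcript carries the time at which it was spoken. The sketch below assumes that data model; the Chunk type and the function names are hypothetical and are not taken from ViaScribe.

```python
import bisect
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Chunk:
    start_seconds: float  # offset into the audio or video recording
    text: str


def find_chunks(transcript: List[Chunk], query: str) -> List[Chunk]:
    """Case-insensitive search; each hit carries the time to seek to."""
    needle = query.lower()
    return [chunk for chunk in transcript if needle in chunk.text.lower()]


def chunk_at(transcript: List[Chunk], position_seconds: float) -> Optional[Chunk]:
    """Chunk being spoken at a playback position (transcript sorted by time)."""
    starts = [chunk.start_seconds for chunk in transcript]
    index = bisect.bisect_right(starts, position_seconds) - 1
    return transcript[index] if index >= 0 else None


if __name__ == "__main__":
    lecture = [
        Chunk(0.0, "Welcome to the lecture..."),
        Chunk(4.5, "Today we cover speech recognition..."),
        Chunk(9.0, "Recognition needs training..."),
    ]
    for hit in find_chunks(lecture, "recognition"):
        print(f"{hit.start_seconds:>5.1f}s  {hit.text}")
```

The same timestamps work in both directions: a search hit tells the player where to start the recording, and the playback position tells the display which chunk to highlight.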
The technology has been named IBM ViaScribe. Up to this point all the
development has revolved around the university student and lecture
environment. It has now been developed sufficiently to consider other
environments where it could be beneficial. Some research has started into
the school environment with some initial success and some new challenges.
However, research into its use in public sector organisations and private
enterprises suggests it will have enormous potential.
A study at RBC Financial Group identified the following opportunities:
Client Transaction
A client who is deaf, or hard of hearing, or for whom English is a second
language, requests services from an RBC employee. The RBC employee, trained
in the use of IBM ViaScribe, speaks naturally to the client. The software
simultaneously transcribes the conversation, making it available as a text
display, and creating a text copy for the client. The synchronous audio and
text transcript can also be used as a training tool to study
employee/customer interactions.
Classroom/Trainer
Multi-media lecture/presentation notes for in-house training and
accreditation sessions can be made available via the web. Real time
captioning of the presentation materials is also possible. Using IBM
ViaScribe in this way creates access for deaf/hard of hearing participants
and also creates an additional learning channel for non-English speakers.
Teleconference enhancement
Real time captioning of calls can eliminate comprehension problems
commonly associated with poor audio quality during teleconferences. Record
keeping is also enhanced as the transcript verifies the content for
participants in real time, can be used to generate meeting minutes, and
allows full access to the content for those who are either late or unable to
attend.
Webcast transcription
Existing webcasts could be captioned, complete with speaker
identification.
ViaScribe is not yet available as a commercial product but any
organisation that believes they could benefit from this technology should
contact the Liberated Learning Consortium.
It is wonderful to see research into accessibility creating a tool that
will be valued by all members of society. I believe that the use of this
type of technology will become commonplace over the next five years.