Computers Are Better Lipreaders than Humans!
September 2009
Editor: A new study by the University of East Anglia suggests computers
are now better at lip-reading than humans. But I bet that depends on which
humans you test - and which computers! In any case, here's the notice from
the folks at the University of East Anglia.
~~~~~~~~~~~~~~~~~
The peer-reviewed findings will be presented for the first time at the
eighth International Conference on Auditory-Visual Speech Processing (AVSP)
2009, held at the University of East Anglia from September 10-13.
A research team from the School of Computing Sciences at UEA compared
the performance of a machine-based lip-reading system with that of 19
human lip-readers. They found that the automated system significantly
outperformed the human lip-readers - scoring a recognition rate of 80 per
cent, compared with only 32 per cent for human viewers on the same task.
Furthermore, they found that machines are able to exploit very simple
features that represent only the shape of the face, whereas human
lip-readers require full video of people speaking.
The study also showed that the dynamics and the full appearance of
speech gestures are very important, in contrast to the traditional approach
to lip-reading training, in which viewers are taught to spot key lip-shapes
from static (often drawn) images.
Using a new video-based training system, viewers with very limited
training significantly improved their ability to lip-read monosyllabic
words, which in itself is a very difficult task. It is hoped this research
might lead to novel methods of lip-reading training for the deaf and hard
of hearing.
"This pilot study is the first time an automated lip-reading system has
been benchmarked against human lip-readers and the results are perhaps
surprising," said the study's lead author Sarah Hilder.
"With just four hours of training, [the video-based system] helped
[viewers] improve their lip-reading skills markedly. We hope this research
will represent a real
technological advance for the deaf community."
Agnes Hoctor, campaigns manager at the RNID, said: "This research
confirms how difficult the vital skill of lip-reading is to learn and why
RNID is campaigning for people who are deaf or hard of hearing to have
improved access to classes. We would welcome the development of
video-based or online training resources to supplement the teaching of
lip-reading. Hearing loss affects 55 per cent of people over 60 so, with
the ageing population, demand to learn lip-reading is only going to
increase."
The AVSP conference is being held in the UK for the first time since
its inception in 1998. The University of East Anglia will host
cutting-edge researchers including psychologists, engineers, scientists and
linguists from as far afield as Australia, Canada and Japan.
As part of the conference, delegates will take part in a Visual Speech
Synthesis Challenge in which a number of visual speech synthesizers, or
'talking heads', will battle it out to determine the most intelligible and
visually appealing system.
AVSP runs as a satellite conference to Interspeech 2009, which will be
held in Brighton. Topics under discussion will include: machine
recognition of audiovisual speech; the role of gestures accompanying
speech; modeling, synthesis and recognition of facial gestures; and speech
synthesis.
Keynote speakers will be Dr Peter Bull of the University of York, who
will be exploring 'The Myth of Body Language', and Prof Louis Goldstein of
the University of Southern California, whose presentation is entitled
'Articulatory Phonology and Audio-Visual Speech'.
'Comparison of human and machine-based lip-reading' by Sarah Hilder,
Richard Harvey and Barry-John Theobald is published in the Proceedings of
the International Conference on Auditory-Visual Speech Processing (AVSP)
2009 on Thursday September 10 2009.
The research will be presented on Saturday September 12 at the
International Conference on Auditory-Visual Speech Processing (AVSP) 2009
at the University of East Anglia.
For more information about the conference, please visit
www.avsp2009.co.uk.
Part of the lip-reading test used to compare the performance of the
machine-based lip-reading system and human lip-readers can be downloaded
here: http://www.jtuk.com/training/part1.html