Auditory-Visual Speech Processing (AVSP) 2009

University of East Anglia, Norwich, UK
September 10-13, 2009

Comparison of Human and Machine-Based Lip-Reading

Sarah Hilder, Richard Harvey, Barry-John Theobald

School of Computing Sciences, University of East Anglia, UK

We investigate the performance of a machine-based lip-reading system using both shape-only parameters and full shape and appearance parameters. Furthermore, we contrast the performance of the machine-based system with human lip-reading ability. We find that the automated system outperforms human lip-readers. Curiously, however, for relatively simple tasks there is little improvement in recognition accuracy when adding full appearance features to the machine-based system, whereas for human lip-readers we observe significant improvements in performance. Finally, we measure the effect of ‘speaker training’ on human lip-reading ability and find that even very limited training is sufficient to improve performance.

Index Terms: automated lip-reading, speechreading

