Interspeech'2005 - Eurospeech

Lisbon, Portugal
September 4-8, 2005

Using the Focus of Visual Attention to Improve Spontaneous Speech Recognition

Neil Cooke, Martin Russell

University of Birmingham, UK

We investigate the recognition of spontaneous speech using the focus of visual attention as a secondary cue. In our experiment we collected a corpus of eye-movement and speech data in which one participant describes a geographical map to another while their eye movements are tracked. Using this corpus, we characterise the coupling between eye movement and speech. Speech recognition results are presented as a proof of concept for a bimodal automatic speech recognition (ASR) system that uses the focus of visual attention to drive a dynamic language model. A marginal improvement in word error rate (WER) is observed.
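
As an illustration only, one simple way a focus of visual attention could drive a dynamic language model is to interpolate a baseline word distribution with a distribution tied to the map landmark currently being fixated. The sketch below assumes unigram models, a fixed interpolation weight, and invented toy data; the function names, the `weight` parameter, and the landmark vocabulary are all hypothetical, and nothing here is taken from the paper's actual method.

```python
from collections import Counter


def unigram(words):
    """Maximum-likelihood unigram distribution from a list of words."""
    counts = Counter(words)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def gaze_adapted_lm(baseline, landmark_lms, focus, weight=0.3):
    """Interpolate the baseline model with the model of the fixated landmark.

    `weight` (a hypothetical parameter) is the probability mass given to the
    landmark-specific model. If `focus` has no associated model, the baseline
    is returned unchanged.
    """
    focus_lm = landmark_lms.get(focus)
    if focus_lm is None:
        return dict(baseline)
    vocab = set(baseline) | set(focus_lm)
    return {
        w: (1 - weight) * baseline.get(w, 0.0) + weight * focus_lm.get(w, 0.0)
        for w in vocab
    }


# Toy data, invented for illustration only.
baseline = unigram("go left past the lake then right past the church".split())
landmark_lms = {
    "lake": unigram("the lake lake shore".split()),
    "church": unigram("the church church tower".split()),
}

# While the speaker fixates the lake, lake-related words become more likely.
adapted = gaze_adapted_lm(baseline, landmark_lms, focus="lake")
```

In this sketch the adapted distribution still sums to one, because it is a convex combination of two normalised distributions. A real system would more plausibly adapt an n-gram model and would need to account for the time lag between fixating an object and naming it.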

Bibliographic reference. Cooke, Neil / Russell, Martin (2005): "Using the focus of visual attention to improve spontaneous speech recognition", in INTERSPEECH-2005, 1213-1216.