ISCA Archive Interspeech 2005

Using the focus of visual attention to improve spontaneous speech recognition

Neil Cooke, Martin Russell

We investigate the recognition of spontaneous speech using the focus of visual attention as a secondary cue to speech. In our experiment we collected a corpus of eye-movement and speech data in which one participant describes a geographical map to another while the describer's eye movements are tracked. Using this corpus we characterise the coupling between eye movement and speech. Speech recognition results are presented as a proof of concept for a bimodal automatic speech recognition (ASR) system in which the focus of visual attention drives a dynamic language model. A marginal improvement in word error rate (WER) is observed.


doi: 10.21437/Interspeech.2005-371

Cite as: Cooke, N., Russell, M. (2005) Using the focus of visual attention to improve spontaneous speech recognition. Proc. Interspeech 2005, 1213-1216, doi: 10.21437/Interspeech.2005-371

@inproceedings{cooke05_interspeech,
  author={Neil Cooke and Martin Russell},
  title={{Using the focus of visual attention to improve spontaneous speech recognition}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={1213--1216},
  doi={10.21437/Interspeech.2005-371}
}