ISCA Archive ICSLP 2000

Perceptual interfaces for information interaction: joint processing of audio and visual information for human-computer interaction

Chalapathi Neti, Giridharan Iyengar, Gerasimos Potamianos, A. Senior, Benoit Maison

We exploit the human perceptual principle of sensory integration (the joint use of audio and visual information) to improve the recognition of human activity (speech recognition, speech event detection, and speaker change), intent (intent to speak), and identity (speaker recognition), particularly in the presence of acoustic degradation due to noise and channel effects. In this paper, we present experimental results in a variety of contexts that demonstrate the benefit of joint audio-visual processing.


Cite as: Neti, C., Iyengar, G., Potamianos, G., Senior, A., Maison, B. (2000) Perceptual interfaces for information interaction: joint processing of audio and visual information for human-computer interaction. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 3, 11-14

@inproceedings{neti00_icslp,
  author={Chalapathi Neti and Giridharan Iyengar and Gerasimos Potamianos and A. Senior and Benoit Maison},
  title={{Perceptual interfaces for information interaction: joint processing of audio and visual information for human-computer interaction}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  volume={3},
  pages={11--14}
}