ISCA Archive Interspeech 2009

Enhancing audio speech using visual speech features

Ibrahim Almajai, Ben Milner

This work presents a novel approach to speech enhancement by exploiting the bimodality of speech and the correlation that exists between audio and visual speech features. For speech enhancement, a visually-derived Wiener filter is developed. This obtains clean speech statistics from visual features by modelling their joint density and making a maximum a posteriori estimate of clean audio from visual speech features. Noise statistics for the Wiener filter utilise an audio-visual voice activity detector which classifies input audio as speech or non-speech, enabling a noise model to be updated. Analysis shows the estimation of speech and noise statistics to be effective, and human listening tests measure the effectiveness of the resulting Wiener filter.
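
As a rough illustration of the two ingredients the abstract names, the sketch below computes a textbook per-frequency Wiener gain from a clean-speech power estimate and a noise power estimate, and updates the noise estimate only in frames flagged as non-speech. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the recursive averaging rule, and the smoothing factor `alpha` are hypothetical, and the clean-speech estimate (derived from visual features in the paper) and the speech/non-speech flag (from the audio-visual voice activity detector) are simply taken as inputs.

```python
def wiener_gain(speech_psd, noise_psd, floor=1e-12):
    """Textbook Wiener gain per frequency bin: S / (S + N).

    speech_psd: estimated clean-speech power per bin (assumed given;
                in the paper it is estimated from visual features).
    noise_psd:  estimated noise power per bin.
    floor:      small constant guarding against division by zero.
    """
    return [s / max(s + n, floor) for s, n in zip(speech_psd, noise_psd)]


def update_noise(noise_psd, frame_psd, is_speech, alpha=0.9):
    """Update the noise estimate only when the frame is non-speech.

    The recursive average and alpha are illustrative assumptions;
    the abstract says only that the detector enables a noise model
    to be updated.
    """
    if is_speech:
        # Speech present: keep the previous noise estimate unchanged.
        return list(noise_psd)
    return [alpha * n + (1 - alpha) * p for n, p in zip(noise_psd, frame_psd)]
```

Multiplying each bin of the noisy spectrum by the corresponding gain attenuates bins where the noise estimate dominates and leaves bins with strong estimated speech nearly untouched.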


doi: 10.21437/Interspeech.2009-576

Cite as: Almajai, I., Milner, B. (2009) Enhancing audio speech using visual speech features. Proc. Interspeech 2009, 1959-1962, doi: 10.21437/Interspeech.2009-576

@inproceedings{almajai09_interspeech,
  author={Ibrahim Almajai and Ben Milner},
  title={{Enhancing audio speech using visual speech features}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={1959--1962},
  doi={10.21437/Interspeech.2009-576}
}