EUROSPEECH 2003 - INTERSPEECH 2003
The low-level acoustic-visual association reported by Yehia et al. (Speech Comm., 26(1):23-43, 1998) is exploited for audio-visual speech enhancement with natural video sequences. The aim of this study is to demonstrate that the redundant components of AV speech can be extracted with a suitable representation that does not involve any categorization process. A comparative study is carried out between different types of audio features, including the initial Line Spectral Pairs (LSP) and 4-subband envelope energy. A gain measure of the enhancement is used for the comparison. The results clearly show that the coarse envelope features allow a better gain than the LSP.
Bibliographic reference. Berthommier, Frédéric (2003): "Audiovisual speech enhancement based on the association between speech envelope and video features", In EUROSPEECH-2003, 1045-1048.