7th International Conference on Spoken Language Processing

September 16-20, 2002
Denver, Colorado, USA

Audio-Visual Speech Enhancement with AVCDCN (Audio-Visual Codebook Dependent Cepstral Normalization)

Sabine Deligne, Gerasimos Potamianos, Chalapathy Neti

IBM T.J. Watson Research Center, USA

In this paper, we introduce a non-linear enhancement technique called Audio-Visual Codebook Dependent Cepstral Normalization (AVCDCN) and consider its use with both audio-only and audio-visual speech recognition.

AVCDCN is inspired by CDCN [1] [2], an audio-only enhancement technique that approximates the non-linear effect of noise on speech with a piecewise constant function. Our experiments show that using visual information in AVCDCN yields significant performance gains over CDCN.
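
The piecewise constant approximation mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not give the actual formulation. The sketch assumes a codebook of diagonal-covariance Gaussian codewords, one constant correction vector per codeword, and an enhanced cepstral vector obtained by subtracting the posterior-weighted correction from the noisy one. All names (`posteriors`, `enhance`, `corrections`) are hypothetical. Under these assumptions, a CDCN-like setup would compute posteriors from the noisy audio features alone, while an AVCDCN-like setup would compute them from a concatenated audio-visual observation.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def posteriors(obs, means, variances, weights):
    """Posterior probability of each codeword given the observation.

    Assumption: `obs` is the noisy audio vector (CDCN-like) or a
    concatenated audio-visual vector (AVCDCN-like).
    """
    logs = [
        math.log(w) + log_gauss_diag(obs, m, v)
        for m, v, w in zip(means, variances, weights)
    ]
    top = max(logs)  # subtract the max for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def enhance(noisy_audio, obs, means, variances, weights, corrections):
    """Subtract the posterior-weighted constant correction from noisy_audio.

    Assumption: corrections[k] is the constant offset attributed to noise
    in the region of codeword k (hypothetical values, learned elsewhere).
    """
    post = posteriors(obs, means, variances, weights)
    shift = [
        sum(p * r[d] for p, r in zip(post, corrections))
        for d in range(len(noisy_audio))
    ]
    return [y - s for y, s in zip(noisy_audio, shift)]
```

In this sketch the only difference between the audio-only and audio-visual variants is the observation passed as `obs`; the correction is always applied to the audio cepstra.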


Bibliographic reference. Deligne, Sabine / Potamianos, Gerasimos / Neti, Chalapathy (2002): "Audio-visual speech enhancement with AVCDCN (audio-visual codebook dependent cepstral normalization)", in ICSLP-2002, 1449-1452.