EUROSPEECH 2001 Scandinavia
7th European Conference on Speech Communication and Technology
2nd INTERSPEECH Event

Aalborg, Denmark
September 3-7, 2001

                 

Separating Speaker and Environment Variabilities for Improved Recognition in Non-Stationary Conditions

Luca Rigazio, Patrick Nguyen, David Kryze, Jean-Claude Junqua

PSTL, USA

This paper addresses the problem of speaker adaptation in noisy environments. We estimate speaker-adapted models from noisy data by combining unsupervised speaker adaptation with noise compensation. The aim is to use the resulting speaker-adapted models in environments that differ from the adaptation environment without a significant loss in performance. The key idea is to separate speaker and environment variabilities and associate them with independent models. We show that linear models for both speaker and environment are critical for achieving this goal. Experiments on 2000 and 4000 isolated-word tasks with real car noise show that unsupervised speaker adaptation combined with noise compensation provides more than 20% error rate reduction compared with noise compensation alone, and more than 50% error rate reduction compared with speaker adaptation alone.
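To make the separation idea concrete, the following is a minimal toy sketch, not the method of the paper. It assumes, purely for illustration, that speaker and environment each contribute an additive offset to a feature mean, so the two offsets can be estimated and recombined independently. All function names, variable names, and numbers are hypothetical.

```python
# Toy illustration only: additive (linear) speaker and environment offsets
# on a feature mean. Names and values are hypothetical, not from the paper.

def add(u, v):
    """Element-wise sum of two equal-length vectors."""
    return [a + b for a, b in zip(u, v)]

def sub(u, v):
    """Element-wise difference of two equal-length vectors."""
    return [a - b for a, b in zip(u, v)]

def estimate_speaker_offset(adapt_mean, si_mean, env_offset):
    """Subtract the speaker-independent mean and the environment offset
    from the adaptation-data mean; the residual is attributed to the speaker."""
    return sub(sub(adapt_mean, si_mean), env_offset)

def compose_model(si_mean, speaker_offset, env_offset):
    """Speaker-independent mean plus independent speaker and environment offsets."""
    return add(add(si_mean, speaker_offset), env_offset)

si_mean = [1.0, 2.0]          # speaker-independent model mean (assumed)
true_speaker = [0.5, -0.25]   # unknown speaker offset (assumed)
env_a = [2.0, 1.0]            # offset of the adaptation environment (assumed known
                              # from a noise-compensation step)
env_b = [-1.0, 4.0]           # offset of a different test environment (assumed)

# Noisy adaptation data observed in environment A.
adapt_mean = compose_model(si_mean, true_speaker, env_a)

# Strip environment A, keep only the speaker component.
speaker_hat = estimate_speaker_offset(adapt_mean, si_mean, env_a)

# Reuse the speaker component in environment B.
model_b = compose_model(si_mean, speaker_hat, env_b)

print(speaker_hat)
print(model_b)
```

Under this additive assumption the speaker offset recovered in one environment carries over unchanged to another, which is the property the abstract attributes to linear models. With a non-additive interaction between speaker and noise, the residual would mix both effects and could not be reused this way.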


Bibliographic reference. Rigazio, Luca / Nguyen, Patrick / Kryze, David / Junqua, Jean-Claude (2001): "Separating speaker and environment variabilities for improved recognition in non-stationary conditions", in EUROSPEECH-2001, 2347-2350.