EUROSPEECH 2001 Scandinavia
In this paper we address the problem of speaker adaptation in noisy environments. We estimate speaker-adapted models from noisy data by combining unsupervised speaker adaptation with noise compensation. We aim to use the resulting speaker-adapted models in environments that differ from the adaptation environment without a significant loss in performance. The key idea is to separate speaker and environment variabilities and associate them with independent models. We show that linear models for both speaker and environment are critical for achieving this goal. Experiments on 2000 and 4000 isolated word tasks with real car noise show that unsupervised speaker adaptation combined with noise compensation can provide more than 20% error rate reduction compared with noise compensation only, and more than 50% error rate reduction compared with speaker adaptation only.
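The separation idea can be illustrated with a toy sketch. It is not the method of the paper: it assumes, purely for illustration, that the speaker and the environment each act on a single feature dimension through a scalar affine map, that the adaptation environment is already known from a noise estimate, and that the data are noise-free. All names and numbers (fit_affine, si_means, env_adapt, env_test, and so on) are hypothetical. Under these assumptions, a speaker model fitted directly on noisy data absorbs the adaptation environment, whereas undoing the environment first yields a speaker model that transfers to a different environment.

```python
# Illustrative sketch only: scalar affine maps stand in for the speaker and
# environment models. Nothing here is taken from the paper itself.

def fit_affine(xs, ys):
    """Least-squares fit of y = a*x + b for two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    var = sum((x - mx) ** 2 for x in xs)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = cov / var
    return a, my - a * mx

def apply_affine(model, xs):
    a, b = model
    return [a * x + b for x in xs]

def invert_affine(model, ys):
    a, b = model
    return [(y - b) / a for y in ys]

def max_abs_err(pred, target):
    return max(abs(p - t) for p, t in zip(pred, target))

# Hypothetical speaker-independent model means (one feature dimension).
si_means = [-2.0, -0.5, 0.0, 1.0, 3.0]

# Hypothetical "true" effects, both affine (an assumption of this sketch).
speaker = (1.2, 0.3)
env_adapt = (0.8, 1.5)    # environment in which adaptation data is collected
env_test = (0.6, -0.7)    # different environment at recognition time

# Adaptation statistics: speaker effect followed by environment effect.
observed = apply_affine(env_adapt, apply_affine(speaker, si_means))

# (a) Speaker adaptation only: the fit also absorbs the adaptation environment.
entangled = fit_affine(si_means, observed)

# (b) Noise compensation first (undo the known environment), then adapt.
speaker_hat = fit_affine(si_means, invert_affine(env_adapt, observed))

# Compare both speaker models after moving to the test environment.
target = apply_affine(env_test, apply_affine(speaker, si_means))
err_entangled = max_abs_err(
    apply_affine(env_test, apply_affine(entangled, si_means)), target)
err_separated = max_abs_err(
    apply_affine(env_test, apply_affine(speaker_hat, si_means)), target)

print("entangled speaker model, test-environment error:", err_entangled)
print("separated speaker model, test-environment error:", err_separated)
```

In this toy setting the separated model reproduces the test-environment means up to rounding, while the entangled one does not. The clean swap of one environment for another works here because affine maps compose and invert in closed form, which gives one intuition for why linearity of both models matters.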
Bibliographic reference. Rigazio, Luca / Nguyen, Patrick / Kryze, David / Junqua, Jean-Claude (2001): "Separating speaker and environment variabilities for improved recognition in non-stationary conditions", In EUROSPEECH-2001, 2347-2350.