Nonnative speech recognition is becoming increasingly important as speech applications are deployed worldwide. Because the population of nonnative speakers is large, speaker adaptation remains the most practical way to provide high-performance speech services. The Subspace Gaussian Mixture Model (SGMM) has recently been shown to yield superior performance on various native speech recognition tasks. In this paper, we investigate different SGMM speaker adaptation techniques for nonnative speech recognition. Based on an analysis of the roles of the SGMM model parameters, we propose a two-stage direct model adaptation approach. Our initial experiments verify that the proposed approach is much more effective than traditional feature-space Maximum Likelihood Linear Regression (MLLR) on SGMM-based nonnative speaker adaptation tasks.
Index Terms: Speaker Adaptation, Nonnative Speech Recognition, Subspace Gaussian Mixture Model
Bibliographic reference. Li, Bo / Sim, Khe Chai (2012): "A two-stage speaker adaptation approach for subspace Gaussian mixture model based nonnative speech recognition", In INTERSPEECH-2012, 1772-1775.