8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


A Switching Linear Gaussian Hidden Markov Model and Its Application to Nonstationary Noise Compensation for Robust Speech Recognition

Jian Wu, Qiang Huo

University of Hong Kong, China

The Switching Linear Gaussian (SLG) model was recently proposed for time series data with nonlinear dynamics. In this paper, we present a new modelling approach, called SLGHMM, that uses a hybrid Dynamic Bayesian Network of SLG models and Continuous Density HMMs (CDHMMs) to compensate for the nonstationary distortion that may exist in a speech utterance to be recognized. With this representation, the CDHMMs (each modelling mainly the linguistic information of a speech unit) and a set of linear Gaussian models (each modelling one kind of stationary distortion) can be jointly learnt from multi-condition training data. Such an SLGHMM can approximately model the distribution of speech corrupted by switching-condition distortions. The effectiveness of the proposed approach is confirmed in noisy speech recognition experiments on the Aurora2 task.
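The generative structure described above can be pictured with a small sampling sketch. It is an illustration under assumptions that the abstract does not state, not the authors' formulation: the distortion is assumed to act as a per-condition affine map plus Gaussian noise on a clean feature vector, the switching variable is assumed to follow its own first-order Markov chain, and every name here (`sample_slghmm`, `hmm_trans`, `sw_trans`, `A`, `b`, `noise_var`) is hypothetical.

```python
import numpy as np


def sample_slghmm(T, hmm_trans, hmm_means, hmm_vars,
                  sw_trans, A, b, noise_var, rng):
    """Draw T frames from a toy switching linear Gaussian HMM.

    Illustrative only: the affine-distortion form and the Markov
    switching chain are assumptions, not taken from the paper.
    """
    n_states = hmm_trans.shape[0]   # number of speech (HMM) states
    n_cond = sw_trans.shape[0]      # number of distortion conditions
    s = rng.integers(n_states)      # initial speech state
    k = rng.integers(n_cond)        # initial distortion condition
    states, conds, obs = [], [], []
    for _ in range(T):
        # Clean feature drawn from the diagonal Gaussian of state s.
        x = rng.normal(hmm_means[s], np.sqrt(hmm_vars[s]))
        # Distorted feature: linear Gaussian model selected by k.
        y = A[k] @ x + b[k] + rng.normal(0.0, np.sqrt(noise_var[k]))
        states.append(s)
        conds.append(k)
        obs.append(y)
        # Advance both hidden chains independently (an assumption).
        s = rng.choice(n_states, p=hmm_trans[s])
        k = rng.choice(n_cond, p=sw_trans[k])
    return np.array(states), np.array(conds), np.array(obs)


# Toy parameters: 2 speech states, 2 distortion conditions, 2-dim features.
rng = np.random.default_rng(0)
hmm_trans = np.array([[0.9, 0.1], [0.2, 0.8]])
hmm_means = np.array([[0.0, 0.0], [3.0, -1.0]])
hmm_vars = np.array([[1.0, 1.0], [0.5, 0.5]])
sw_trans = np.array([[0.95, 0.05], [0.10, 0.90]])
A = np.stack([np.eye(2), 0.7 * np.eye(2)])   # condition 0 leaves x unscaled
b = np.array([[0.0, 0.0], [1.5, 1.5]])
noise_var = np.array([[0.01, 0.01], [0.2, 0.2]])

states, conds, obs = sample_slghmm(
    50, hmm_trans, hmm_means, hmm_vars, sw_trans, A, b, noise_var, rng)
```

In this toy setup each frame carries two hidden labels, a speech state and a distortion condition, which mirrors the idea of CDHMMs and a set of linear Gaussian models sharing one network. Joint learning from multi-condition data, as the abstract describes, would require inference over both chains and is not shown here.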

Bibliographic reference.  Wu, Jian / Huo, Qiang (2003): "A switching linear Gaussian hidden Markov model and its application to nonstationary noise compensation for robust speech recognition", In EUROSPEECH-2003, 977-980.