Interspeech'2005 - Eurospeech

Lisbon, Portugal
September 4-8, 2005

Robust Speech Recognition Based on Noise and SNR Classification - A Multiple-Model Framework

Haitian Xu, Zheng-Hua Tan, Paul Dalsgaard, Børge Lindberg

Aalborg University, Denmark

This paper presents a multiple-model framework for noise-robust speech recognition. In this framework, multiple HMM model sets are trained - each identified by a noise type and a specific Signal-to-Noise Ratio (SNR) value. This, however, does not increase the computational complexity of the recognition process since only one model set is selected according to the noise classification and SNR estimation. The optimal number of model sets is first identified on the basis of the Aurora 2 database. With only three model sets for each noise type, the framework shows superior performance to Multi-style TRaining (MTR) when testing on known noise types but lower performance on unknown noise types. To overcome this drawback, a modified Jacobian method is proposed to adapt the selected HMM models to the test environment. Furthermore, given the fact that MTR often gives relatively stable performance for unknown noise types, a combined technique is applied in which interpolation between the MTR and the adapted models is performed. This combined technique gives more than 24% performance improvement as compared to MTR.
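
The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `ModelSet` record, the function names, the nearest-SNR rule, the example SNR values, and the linear interpolation with a fixed weight are all assumptions made for illustration. The abstract states only that one model set is selected from the noise classification and SNR estimate, and that MTR and adapted models are interpolated.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class ModelSet:
    """Hypothetical record for one trained HMM model set."""
    noise_type: str   # label produced by the noise classifier
    snr_db: float     # SNR (dB) the set was trained for
    name: str         # identifier of the stored models


def select_model_set(model_sets: Sequence[ModelSet],
                     noise_type: str,
                     estimated_snr_db: float) -> Optional[ModelSet]:
    """Pick the set of the classified noise type whose training SNR is
    closest to the estimated SNR (nearest-SNR rule is an assumption)."""
    candidates = [m for m in model_sets if m.noise_type == noise_type]
    if not candidates:
        return None  # unknown noise type: caller must fall back, e.g. to MTR
    return min(candidates, key=lambda m: abs(m.snr_db - estimated_snr_db))


def interpolate_means(mtr_mean: Sequence[float],
                      adapted_mean: Sequence[float],
                      weight: float) -> List[float]:
    """Linear interpolation of two mean vectors (assumed form):
    weight * adapted + (1 - weight) * MTR."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(mtr_mean) != len(adapted_mean):
        raise ValueError("mean vectors must have the same length")
    return [weight * a + (1.0 - weight) * m
            for m, a in zip(mtr_mean, adapted_mean)]


# Three model sets per noise type, as in the abstract; SNR values are made up.
MODEL_SETS = [
    ModelSet("subway", 5.0, "subway_5dB"),
    ModelSet("subway", 10.0, "subway_10dB"),
    ModelSet("subway", 15.0, "subway_15dB"),
    ModelSet("car", 5.0, "car_5dB"),
    ModelSet("car", 10.0, "car_10dB"),
    ModelSet("car", 15.0, "car_15dB"),
]
```

Because only one model set is chosen per utterance, decoding cost matches that of a single model set, which is the property the abstract highlights. A call such as `select_model_set(MODEL_SETS, "subway", 12.0)` returns the 10 dB subway set under the nearest-SNR assumption.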

Bibliographic reference. Xu, Haitian / Tan, Zheng-Hua / Dalsgaard, Paul / Lindberg, Børge (2005): "Robust speech recognition based on noise and SNR classification - a multiple-model framework", in INTERSPEECH-2005, 977-980.