EUROSPEECH 2003 - INTERSPEECH 2003
Impulsive noise usually introduces sudden mismatches between the observation features and acoustic models trained on clean speech, which drastically degrades the performance of automatic speech recognition (ASR) systems. This paper presents a novel method that directly suppresses the adverse effect of impulsive noise on recognition. The observation vector is divided into several sub-vectors according to the noise sensitivity of each feature dimension, and each sub-vector is assigned a suitable flooring threshold. In the recognition stage, the observation probability of each feature sub-vector is floored at the Gaussian mixture level. This eliminates the unreliable relative probability differences caused by impulsive noise, so the expected correct state sequence regains its priority in decoding. Experimental evaluations on the Aurora2 database show that the proposed method achieves average error rate reductions (ERR) of 61.62% and 84.32% in simulated impulsive noise and machine-gun noise environments, respectively, while maintaining high performance on clean speech.
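The flooring step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes diagonal-covariance Gaussian mixtures, log-domain arithmetic, and flooring applied to each sub-vector's log-likelihood inside every mixture component before the components are summed. The function names, the sub-vector partition, and the floor values are all hypothetical choices made for illustration.

```python
import math

def subvector_log_gauss(obs, mean, var, dims):
    """Log density of a diagonal-covariance Gaussian restricted to the dimensions in `dims`."""
    total = 0.0
    for d in dims:
        diff = obs[d] - mean[d]
        total += -0.5 * (math.log(2.0 * math.pi * var[d]) + diff * diff / var[d])
    return total

def floored_log_likelihood(obs, weights, means, variances, subvectors, log_floors):
    """State log-likelihood with each sub-vector floored inside each mixture component.

    `subvectors` is a list of index lists partitioning the feature dimensions;
    `log_floors` holds one (assumed, hand-picked) log-domain floor per sub-vector.
    """
    per_mixture = []
    for w, mean, var in zip(weights, means, variances):
        log_p = math.log(w)
        for dims, log_floor in zip(subvectors, log_floors):
            # Clamp the sub-vector score so one corrupted sub-vector cannot dominate.
            log_p += max(subvector_log_gauss(obs, mean, var, dims), log_floor)
        per_mixture.append(log_p)
    # Log-sum-exp over the mixture components.
    peak = max(per_mixture)
    return peak + math.log(sum(math.exp(v - peak) for v in per_mixture))

# Toy example: two single-mixture states, two one-dimensional sub-vectors.
subvectors = [[0], [1]]
state_a = ([1.0], [[0.0, 0.0]], [[1.0, 1.0]])
state_b = ([1.0], [[1.0, 1.0]], [[1.0, 1.0]])
corrupted = [0.0, 50.0]  # dimension 1 hit by an impulse; dimension 0 still matches state A

no_floor = [float("-inf"), float("-inf")]
with_floor = [-10.0, -10.0]

unfloored_a = floored_log_likelihood(corrupted, *state_a, subvectors, no_floor)
unfloored_b = floored_log_likelihood(corrupted, *state_b, subvectors, no_floor)
floored_a = floored_log_likelihood(corrupted, *state_a, subvectors, with_floor)
floored_b = floored_log_likelihood(corrupted, *state_b, subvectors, with_floor)
```

In this toy case the unfloored scores favour state B purely because of the corrupted dimension, while the floored scores favour state A, whose mean matches the uncorrupted dimension. Passing a floor of negative infinity recovers the ordinary mixture likelihood, so the same function serves as a baseline for comparison.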
Bibliographic reference. Ding, Pei / Shi, Bertram E. / Fung, Pascale / Cao, Zhigang (2003): "Flooring the observation probability for robust ASR in impulsive noise", In EUROSPEECH-2003, 1777-1780.