INTERSPEECH 2011
12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Feature Compensation for Speech Recognition in Severely Adverse Environments Due to Background Noise and Channel Distortion

Wooil Kim, John H. L. Hansen

University of Texas at Dallas, USA

This paper proposes an effective feature compensation scheme to address severely adverse environments for robust speech recognition, where background noise and channel distortion are involved simultaneously. An iterative channel estimation method is integrated into the framework of our Parallel Combined Gaussian Mixture Model (PCGMM) based feature compensation algorithm [1]. A new speech corpus is generated which reflects both additive and convolutional noise corruption; the channel distortion effects are obtained from the NTIMIT and CTIMIT corpora. Evaluation with objective speech quality measures, including STNR and PESQ, and with speech recognition shows that the generated corpus represents highly challenging acoustic conditions for speech recognition. Performance evaluation of the proposed system on this corpus demonstrates that the proposed feature compensation scheme significantly improves speech recognition performance in the presence of both background noise and channel distortion, compared to conventional methods including the ETSI AFE.

Reference

  1. W. Kim and J. H. L. Hansen, "Feature Compensation in the Cepstral Domain Employing Model Combination," Speech Communication, vol. 51, no. 2, pp. 83-96, 2009.

Bibliographic reference.  Kim, Wooil / Hansen, John H. L. (2011): "Feature compensation for speech recognition in severely adverse environments due to background noise and channel distortion", In INTERSPEECH-2011, 1653-1656.