5th International Conference on Spoken Language Processing
This paper proposes a novel architecture that integrates a recurrent neural network (RNN) based compensation process and a hidden Markov model (HMM) based recognition process into a unified framework. The RNN estimates the additive bias that represents the telephone channel effect in the cepstral domain. Telephone channel effects are compensated by subtracting this additive bias from the cepstral coefficients of the input utterance. The integrated recognition system is trained with the MCE/GPD (minimum classification error/generalized probabilistic descent) method, using an objective function designed to minimize the recognition error rate. Experimental results for speaker-independent Mandarin polysyllabic word recognition show an error rate reduction of 21.5% compared to the baseline system.
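The compensation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the network topology, so the Elman-style recurrence, the choice of one bias vector per utterance taken after the last frame, and the names `rnn_bias`, `compensate`, `W`, `U`, and `V` are all assumptions made for illustration. The MCE/GPD training of the weights is not shown.

```python
import math


def rnn_bias(frames, W, U, V):
    """Estimate an additive cepstral bias with a tiny Elman-style RNN.

    Assumed form (hypothetical, not taken from the paper):
        h_t  = tanh(W x_t + U h_{t-1})
        bias = V h_T   (read out after the last frame)
    frames: list of cepstral vectors, one per frame
    W: hidden x input, U: hidden x hidden, V: output x hidden
    """
    hidden = [0.0] * len(W)
    for x in frames:
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(W[j], x))
                + sum(u * hi for u, hi in zip(U[j], hidden))
            )
            for j in range(len(W))
        ]
    return [sum(v * h for v, h in zip(row, hidden)) for row in V]


def compensate(frames, bias):
    """Subtract the additive bias from every cepstral frame."""
    return [[c - b for c, b in zip(frame, bias)] for frame in frames]


# Toy utterance: two frames of two cepstral coefficients each.
frames = [[1.0, 2.0], [3.0, 4.0]]

# Placeholder weights (one hidden unit); in the paper's framework the
# weights would be learned jointly with the HMM under MCE/GPD.
W = [[0.1, -0.2]]
U = [[0.5]]
V = [[1.0], [0.5]]

bias = rnn_bias(frames, W, U, V)
cleaned = compensate(frames, bias)
```

The cleaned frames would then be passed to the HMM recognizer. Because the subtraction is differentiable with respect to the bias, the recognizer's error criterion can in principle be propagated back into the bias estimator, which is what makes a unified training scheme possible.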
Bibliographic reference. Chang, Sen-Chia / Chien, Shih-Chieh / Kuo, Chih-Chung (1998): "An RNN-based compensation method for Mandarin telephone speech recognition", in ICSLP-1998, paper 1077.