EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

        

A Noise-Robust ASR Back-End Technique Based on Weighted Viterbi Recognition

Xiaodong Cui (1), Alexis Bernard (2), Abeer Alwan (1)

(1) University of California at Los Angeles, USA
(2) Texas Instruments Inc., USA

The performance of speech recognition systems trained in quiet degrades significantly under noisy conditions. To address this problem, a Weighted Viterbi Recognition (WVR) algorithm is proposed in which the weighting is a function of the SNR of each speech frame. The acoustic models, trained on clean data, and the acoustic front-end features are kept unchanged in this approach. Instead, during the Viterbi decoding stage, a confidence/robustness factor is assigned to the output observation probability of each speech frame according to its SNR estimate. Comparative experiments are conducted using WVR with three different front-end features: MFCC, LPCC and PLP. Results show consistent improvements with all three feature vectors. For a reasonable size of adaptation data, WVR outperforms environment adaptation using MLLR.
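
The frame-weighting idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the weighting, so the code below makes two assumptions that are not taken from the paper text above: the confidence factor is applied as an exponent on each frame's observation probability (a multiplier in the log domain), and the SNR-to-weight mapping is a simple sigmoid. The names `snr_to_weight`, `weighted_viterbi`, `midpoint_db` and `slope` are hypothetical and chosen only for illustration.

```python
import math


def snr_to_weight(snr_db, midpoint_db=10.0, slope=0.3):
    """Hypothetical mapping from a frame SNR estimate (dB) to a weight in (0, 1).

    Assumption: a sigmoid, so low-SNR frames get weights near 0 and
    high-SNR frames get weights near 1. The paper may use another mapping.
    """
    return 1.0 / (1.0 + math.exp(-slope * (snr_db - midpoint_db)))


def weighted_viterbi(log_init, log_trans, log_obs, weights):
    """Viterbi decoding with a per-frame weight on the observation score.

    log_init[j]     : log initial probability of state j
    log_trans[i][j] : log transition probability from state i to state j
    log_obs[t][j]   : log observation probability of frame t in state j
    weights[t]      : confidence factor for frame t (assumed exponent weighting,
                      i.e. the observation log-probability is multiplied by it)

    Returns (best state path, log score of that path).
    """
    n_states = len(log_init)
    delta = [log_init[j] + weights[0] * log_obs[0][j] for j in range(n_states)]
    backptrs = []

    for t in range(1, len(log_obs)):
        new_delta, ptr = [], []
        for j in range(n_states):
            # Best predecessor for state j; transitions are left unweighted.
            best_i = max(range(n_states), key=lambda i: delta[i] + log_trans[i][j])
            score = delta[best_i] + log_trans[best_i][j] + weights[t] * log_obs[t][j]
            new_delta.append(score)
            ptr.append(best_i)
        delta = new_delta
        backptrs.append(ptr)

    # Backtrack from the best final state.
    last = max(range(n_states), key=lambda j: delta[j])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, delta[last]
```

With all weights set to 1 this reduces to standard Viterbi decoding. A weight near 0 makes the decoder rely on the transition probabilities and neighbouring frames for that frame, which is the intended effect for frames with a low SNR estimate. Under these assumptions, the clean-trained models and the features are untouched; only the decoding score changes.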


Bibliographic reference. Cui, Xiaodong / Bernard, Alexis / Alwan, Abeer (2003): "A noise-robust ASR back-end technique based on weighted Viterbi recognition", in EUROSPEECH-2003, 2169-2172.