EUROSPEECH 2003 - INTERSPEECH 2003
The performance of speech recognition systems trained in quiet conditions degrades significantly in noise. To address this problem, a Weighted Viterbi Recognition (WVR) algorithm is proposed in which the weighting is a function of the SNR of each speech frame. In this approach, the acoustic models are trained on clean data and the acoustic front-end features are kept unchanged. Instead, during the Viterbi decoding stage, a confidence/robustness factor is assigned to the output observation probability of each speech frame according to its SNR estimate. Comparative experiments are conducted using WVR with three different front-end features: MFCC, LPCC and PLP. Results show consistent improvements with all three feature vectors. For a reasonable size of adaptation data, WVR outperforms environment adaptation using MLLR.
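The decoding idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact weighting rule, so the code below makes two assumptions: the confidence factor is applied as an exponent on each frame's observation probability (a multiplier on its log-likelihood), and the factor is obtained from the SNR estimate by a simple clipped linear mapping. The names `snr_to_weight`, `weighted_viterbi`, `floor_db` and `ceil_db` are hypothetical and are not taken from the paper.

```python
import math


def snr_to_weight(snr_db, floor_db=0.0, ceil_db=20.0):
    """Hypothetical mapping (an assumption, not from the paper):
    clip the frame SNR linearly into a confidence factor in [0, 1]."""
    w = (snr_db - floor_db) / (ceil_db - floor_db)
    return min(1.0, max(0.0, w))


def weighted_viterbi(log_pi, log_A, log_B, weights):
    """Viterbi decoding with per-frame weights on the observation term.

    log_pi[j]    : log initial probability of state j
    log_A[i][j]  : log transition probability from state i to state j
    log_B[t][j]  : log observation likelihood of frame t in state j
    weights[t]   : confidence factor of frame t (assumed exponent on b_j(o_t))
    Returns the best state path and its weighted log score.
    """
    n_states = len(log_pi)
    delta = [log_pi[j] + weights[0] * log_B[0][j] for j in range(n_states)]
    backptrs = []
    for t in range(1, len(log_B)):
        new_delta, bp = [], []
        for j in range(n_states):
            # Best predecessor of state j at frame t.
            best_i = max(range(n_states), key=lambda i: delta[i] + log_A[i][j])
            bp.append(best_i)
            # A low-confidence frame contributes less to the path score.
            new_delta.append(
                delta[best_i] + log_A[best_i][j] + weights[t] * log_B[t][j]
            )
        delta = new_delta
        backptrs.append(bp)
    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda j: delta[j])
    best_score = delta[state]
    path = [state]
    for bp in reversed(backptrs):
        state = bp[state]
        path.append(state)
    path.reverse()
    return path, best_score


# Toy two-state model; the middle frame is treated as noise-corrupted.
log_pi = [math.log(0.5), math.log(0.5)]
log_A = [[math.log(0.9), math.log(0.1)],
         [math.log(0.1), math.log(0.9)]]
log_B = [[math.log(0.8), math.log(0.2)],
         [math.log(0.001), math.log(0.999)],
         [math.log(0.8), math.log(0.2)]]

plain_path, _ = weighted_viterbi(log_pi, log_A, log_B, [1.0, 1.0, 1.0])
weights = [snr_to_weight(s) for s in (20.0, 0.0, 20.0)]
wvr_path, _ = weighted_viterbi(log_pi, log_A, log_B, weights)
```

With all weights equal to 1 the function reduces to standard Viterbi decoding. In the toy example the misleading middle frame pulls the unweighted path into state 1, while giving that frame a weight of 0 lets the two reliable frames and the transition probabilities decide the path. The acoustic models and the features are untouched in both cases, which matches the back-end nature of the approach.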
Bibliographic reference. Cui, Xiaodong / Bernard, Alexis / Alwan, Abeer (2003): "A noise-robust ASR back-end technique based on weighted Viterbi recognition", In EUROSPEECH-2003, 2169-2172.