Sixth European Conference on Speech Communication and Technology
(EUROSPEECH'99)

Budapest, Hungary
September 5-9, 1999

Robust Connected Word Speech Recognition Using Weighted Viterbi Algorithm and Context-Dependent Temporal Constraints

Nestor Becerra Yoma (1,2), Lee Luan Ling (1), Sandra Dotto Stump (2)

(1) DECOM/FEEC/UNICAMP, Campinas-SP, Brazil
(2) Mackenzie University, Sao Paulo-SP, Brazil

This paper addresses connected-word speech recognition with signals corrupted by additive and convolutional noise. Context-dependent temporal constraints are proposed, compared with ordinary temporal restrictions, and used in combination with the weighted Viterbi algorithm, which had been tested in isolated-word recognition experiments in previous papers. Connected-word recognition tests show that the weighted Viterbi algorithm depends on the accuracy of the state duration modelling, and that the approach covered here can reduce the error rate by as much as 90 or 95% at moderate SNR using spectral subtraction, an easily implemented technique, even with a poor noise estimate and without using any information about the speaker. It is also shown that the weighting procedure can reduce the error rate when cepstral mean normalization is used as well, in order to cancel both additive and convolutional noise.
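For readers unfamiliar with the two front-end techniques named in the abstract, the following is a minimal textbook-style sketch of power spectral subtraction and cepstral mean normalization. It is not the authors' implementation: the function names, the `floor` parameter, and the list-of-frames data layout are assumptions made for illustration, and the weighted Viterbi algorithm and temporal constraints are not shown because the abstract does not specify them.

```python
def spectral_subtraction(noisy_power, noise_power, floor=0.01):
    """Subtract a noise power estimate from every frame, bin by bin.

    noisy_power: list of frames, each a list of power-spectrum values.
    noise_power: one noise power estimate per frequency bin (assumed
                 here to be fixed for the whole utterance).
    floor:       hypothetical spectral floor, as a fraction of the noisy
                 power, that keeps the result from going negative.
    """
    cleaned = []
    for frame in noisy_power:
        cleaned.append([
            max(y - n, floor * y)  # clamp to the floor when the noise estimate exceeds the signal
            for y, n in zip(frame, noise_power)
        ])
    return cleaned


def cepstral_mean_normalization(cepstra):
    """Remove the per-coefficient mean taken over all frames of an utterance.

    A fixed convolutional distortion (a channel) adds a constant offset in
    the cepstral domain, so subtracting the utterance mean removes it.
    """
    n_frames = len(cepstra)
    n_coeffs = len(cepstra[0])
    means = [sum(frame[k] for frame in cepstra) / n_frames
             for k in range(n_coeffs)]
    return [[frame[k] - means[k] for k in range(n_coeffs)]
            for frame in cepstra]


if __name__ == "__main__":
    # Toy values chosen only to make the arithmetic easy to follow.
    print(spectral_subtraction([[4.0, 1.0], [2.0, 3.0]], [1.0, 2.0], floor=0.1))
    print(cepstral_mean_normalization([[1.0, 2.0], [3.0, 6.0]]))
```

In this sketch the second bin of the first frame falls below the noise estimate and is clamped to the floor. A poor noise estimate makes such bins unreliable, which is the kind of frame-level uncertainty a weighting procedure could take into account.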



Bibliographic reference. Yoma, Nestor Becerra / Ling, Lee Luan / Stump, Sandra Dotto (1999): "Robust connected word speech recognition using weighted Viterbi algorithm and context-dependent temporal constraints", In EUROSPEECH'99, 2869-2872.