Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

Temporal Constraints in Viterbi Alignment for Speech Recognition in Noise

Nestor Becerra Yoma (1,2), Lee Luan Ling (1), Sandra Dotto Stump (2)

(1) DECOM/FEEC/UNICAMP, Campinas-SP, Brazil
(2) Mackenzie University, Sao Paulo-SP, Brazil

This paper addresses the problem of temporal constraints in the Viterbi algorithm using conditional transition probabilities. The results presented here suggest that, in a speaker-dependent small-vocabulary task, statistical modelling of state durations is not relevant once maximum and minimum state duration restrictions are imposed, and that truncated probability densities give better results than a previously proposed metric [1]. Finally, context-dependent and context-independent temporal restrictions are compared in a connected-word speech recognition task, and the former are shown to give better results at the same computational load.
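
As an illustration of what imposing maximum and minimum state duration restrictions in Viterbi alignment can look like, the following is a minimal Python sketch, not the authors' implementation. It assumes a left-to-right HMM without skips, takes frame log-likelihoods as given, and applies only hard duration bounds (a transition out of a state is forbidden before its minimum duration and staying is forbidden beyond its maximum), with no duration density inside the bounds. The function name `constrained_viterbi` and the parameters `log_emis`, `d_min`, and `d_max` are hypothetical.

```python
NEG_INF = float("-inf")


def constrained_viterbi(log_emis, d_min, d_max):
    """Left-to-right Viterbi alignment with hard min/max state durations.

    log_emis[t][s] is the log-likelihood of frame t under state s.
    d_min[s] >= 1 and d_max[s] >= d_min[s] bound the frames spent in state s.
    Returns (best log score, state index per frame), or (-inf, []) when no
    alignment satisfies the bounds. Illustrative assumption: no skips.
    """
    T, S = len(log_emis), len(d_min)
    # score[s][d]: best log score of a path that is in state s at the
    # current frame and has spent exactly d consecutive frames there.
    score = [[NEG_INF] * (d_max[s] + 1) for s in range(S)]
    score[0][1] = log_emis[0][0]  # alignment must start in the first state
    back = [None]  # back[t][s][d] = (previous state, previous duration)

    for t in range(1, T):
        new = [[NEG_INF] * (d_max[s] + 1) for s in range(S)]
        ptr = [[None] * (d_max[s] + 1) for s in range(S)]
        for s in range(S):
            # Stay in s: duration grows by one, never beyond d_max[s].
            for d in range(2, d_max[s] + 1):
                if score[s][d - 1] > NEG_INF:
                    new[s][d] = score[s][d - 1] + log_emis[t][s]
                    ptr[s][d] = (s, d - 1)
            # Enter s from s-1: only if s-1 lasted at least d_min[s-1].
            if s > 0:
                best, best_d = NEG_INF, None
                for d in range(d_min[s - 1], d_max[s - 1] + 1):
                    if score[s - 1][d] > best:
                        best, best_d = score[s - 1][d], d
                if best_d is not None:
                    new[s][1] = best + log_emis[t][s]
                    ptr[s][1] = (s - 1, best_d)
        score = new
        back.append(ptr)

    # The path must end in the last state with an admissible duration.
    last = S - 1
    best, best_d = NEG_INF, None
    for d in range(d_min[last], d_max[last] + 1):
        if score[last][d] > best:
            best, best_d = score[last][d], d
    if best_d is None:
        return NEG_INF, []

    # Trace the back-pointers from the final frame to the first.
    path, s, d = [], last, best_d
    for t in range(T - 1, -1, -1):
        path.append(s)
        if t > 0:
            s, d = back[t][s][d]
    path.reverse()
    return best, path


# Toy example: frame 0 favours state 0, all later frames favour state 1.
frames = [[0, -5]] + [[-5, 0]] * 5
print(constrained_viterbi(frames, d_min=[1, 1], d_max=[6, 6]))
print(constrained_viterbi(frames, d_min=[3, 1], d_max=[6, 6]))
```

In the toy example, loose bounds let the path leave state 0 after one frame, while a minimum duration of three frames for state 0 delays the transition at a cost in likelihood. The search runs over (state, duration) pairs, so its cost grows with the largest allowed duration.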


Bibliographic reference. Yoma, Nestor Becerra / Ling, Lee Luan / Stump, Sandra Dotto (1999): "Temporal constraints in Viterbi alignment for speech recognition in noise", in EUROSPEECH'99, 2861-2864.