Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Speech Recognition Using Context Conditional Word Posterior Probabilities

Ralf Schlüter, Frank Wessel, Hermann Ney

Lehrstuhl für Informatik VI, Computer Science Department, Aachen University of Technology, Germany

In this paper, two new scoring schemes for large vocabulary continuous speech recognition are compared. Instead of using the joint probability of a word sequence and a sequence of acoustic observations, we determine the best path through a word graph using posterior word probabilities with or without word context. The exact calculation of the posterior probability for a word sequence implies a sum over all possible word boundaries, which is approximated by a maximum operation in the standard scoring approach. The new scoring scheme using word posterior probabilities could be expected to lead to improved recognition performance, because it involves partial summation over word boundaries. We present experimental results on five different corpora: the Dutch Arise corpus, the German Verbmobil ’98 corpus, the English North American Business ’94 20k and 64k development corpora, and the English Broadcast News ’96 corpus. It is shown that the Viterbi approximation within words has no effect on either standard or word-posterior-based recognition. Using word posterior probabilities with and without word context, the relative reduction in word error rate is comparable and ranges between 1.5% and 5%. A reason why the additional consideration of word context does not further improve the recognition performance might be that the increase in word context information is traded against a decrease in the number of word sequences that contribute to a particular word posterior probability.
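The word posterior probabilities described above are typically obtained by a forward-backward computation over the word graph: the posterior of a graph edge is its forward score times its backward score, normalized by the total graph probability, and the posterior of a word is then accumulated over edges carrying the same word label, which realizes the partial summation over word boundaries. The following is a minimal illustrative sketch (not the authors' implementation) on a toy lattice, where the hypothetical edge scores stand in for the combined acoustic and language model probabilities:

```python
from collections import defaultdict

def edge_posteriors(edges, start, final):
    """Forward-backward over a word graph.

    edges: list of (u, v, word, score) tuples; scores stand in for the
    joint acoustic/language-model probabilities along each edge.
    Assumes nodes are numbered in topological order (u < v).
    Returns a dict mapping edge index -> posterior probability.
    """
    alpha = defaultdict(float)  # forward scores per node
    beta = defaultdict(float)   # backward scores per node
    alpha[start] = 1.0
    beta[final] = 1.0
    # Forward pass in topological order of source nodes.
    for u, v, w, s in sorted(edges, key=lambda e: e[0]):
        alpha[v] += alpha[u] * s
    # Backward pass in reverse topological order of target nodes.
    for u, v, w, s in sorted(edges, key=lambda e: -e[1]):
        beta[u] += s * beta[v]
    Z = alpha[final]  # total probability of all paths through the graph
    return {i: alpha[u] * s * beta[v] / Z
            for i, (u, v, w, s) in enumerate(edges)}

# Toy word graph: "a c" with probability 0.6, "b c" with probability 0.4.
edges = [(0, 1, "a", 0.6), (0, 2, "b", 0.4),
         (1, 3, "c", 1.0), (2, 3, "c", 1.0)]
posts = edge_posteriors(edges, start=0, final=3)

# Word posterior: sum edge posteriors over edges with the same word label,
# i.e. the partial summation over alternative word boundaries.
word_post = defaultdict(float)
for i, (u, v, w, s) in enumerate(edges):
    word_post[w] += posts[i]
```

In this toy example, the two "c" edges differ only in their predecessor, so summing their posteriors yields a word posterior of 1.0 for "c", whereas a pure maximum (Viterbi) operation would keep only the 0.6 path.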


Bibliographic reference.  Schlüter, Ralf / Wessel, Frank / Ney, Hermann (2000): "Speech recognition using context conditional word posterior probabilities", in Proc. ICSLP-2000, vol. 2, pp. 923-926.