Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

Categorical Understanding Using Statistical Ngram Models

Alexandros Potamianos (1), Giuseppe Riccardi (2), Shrikanth Narayanan (2)

(1) Bell Labs, Lucent Tech., Murray Hill, NJ, USA
(2) AT&T Labs-Research, Florham Park, NJ, USA

In this paper, the speech understanding problem in the context of a spoken dialog system is formalized in a maximum likelihood framework. Word and dialog-state n-grams are used for building categorical understanding and dialog models, respectively. Acoustic confidence scores are incorporated in the understanding formulation. Problems due to data sparseness and out-of-vocabulary words are discussed. For a computer gaming application, incorporating dialog models reduces the understanding error rate by 15-25% relative, and acoustic confidence scores yield a further 10% error reduction.
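
The combination of word n-grams and dialog-state n-grams described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes bigrams with add-one smoothing, picks the category c that maximizes log P(words | c) + log P(c | previous state), and leaves out the acoustic confidence scores. The class `BigramModel`, the function `classify`, the category names, and the toy training data are all hypothetical.

```python
import math
from collections import defaultdict


class BigramModel:
    """Add-one smoothed bigram model over token sequences (illustrative only)."""

    def __init__(self, vocab_size):
        self.vocab_size = vocab_size          # number of distinct tokens, excluding <s>
        self.bigram = defaultdict(int)        # counts of (previous, current) pairs
        self.context = defaultdict(int)       # counts of previous tokens

    def train(self, sequences):
        for seq in sequences:
            tokens = ["<s>"] + list(seq)
            for prev, cur in zip(tokens, tokens[1:]):
                self.bigram[(prev, cur)] += 1
                self.context[prev] += 1

    def cond_logprob(self, prev, cur):
        # log P(cur | prev) with add-one smoothing, so unseen pairs get nonzero mass
        num = self.bigram[(prev, cur)] + 1
        den = self.context[prev] + self.vocab_size
        return math.log(num / den)

    def logprob(self, seq):
        tokens = ["<s>"] + list(seq)
        return sum(self.cond_logprob(p, c) for p, c in zip(tokens, tokens[1:]))


def classify(words, prev_state, word_models, dialog_model):
    """Return the category maximizing word likelihood plus dialog-state prior."""
    scores = {
        cat: model.logprob(words) + dialog_model.cond_logprob(prev_state, cat)
        for cat, model in word_models.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy data: two categories for a game-like command interface.
utterances = {
    "move": [["go", "north"], ["move", "north"], ["go", "south"]],
    "attack": [["attack", "the", "troll"], ["hit", "the", "troll"]],
}
vocab = {w for sents in utterances.values() for s in sents for w in s}

# One word bigram model per category.
word_models = {}
for cat, sents in utterances.items():
    model = BigramModel(vocab_size=len(vocab))
    model.train(sents)
    word_models[cat] = model

# Dialog-state bigram model trained on hypothetical category sequences.
dialog_model = BigramModel(vocab_size=len(utterances))
dialog_model.train([["move", "move", "attack"], ["move", "attack", "attack"]])

print(classify(["go", "north"], "move", word_models, dialog_model))
print(classify(["hit", "the", "troll"], "move", word_models, dialog_model))
```

In this sketch the dialog model acts as a prior over the next category given the previous dialog state, so it can tip the decision when the word evidence alone is weak. Add-one smoothing is only a stand-in for whatever treatment of sparse data and out-of-vocabulary words the paper discusses.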

Bibliographic reference.  Potamianos, Alexandros / Riccardi, Giuseppe / Narayanan, Shrikanth (1999): "Categorical understanding using statistical ngram models", In EUROSPEECH'99, 2027-2030.