5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Probabilistic Modeling with Bayesian Networks for Automatic Speech Recognition

Geoffrey Zweig (1), Stuart Russell (2)

(1) IBM T.J. Watson Research Center, USA
(2) U.C. Berkeley, USA

This paper describes the application of Bayesian networks to automatic speech recognition (ASR). Bayesian networks enable the construction of probabilistic models in which an arbitrary set of variables can be associated with each speech frame in order to explicitly model factors such as acoustic context, speaking rate, or articulator positions. Once the basic inference machinery is in place, a wide variety of models can be expressed and tested. We have implemented a Bayesian network system for isolated word recognition, and present experimental results on the PhoneBook database. These results indicate that performance improves when the observations are conditioned on an auxiliary variable modeling acoustic/articulatory context. The use of multivalued and multiple context variables further improves recognition accuracy.
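As a rough illustration of what conditioning the observations on an auxiliary context variable can look like, the following minimal sketch computes the likelihood of an observation sequence with a forward pass over the joint hidden state (q, c) of each frame. It is not the authors' system. The discrete observations, the first-order and mutually independent transitions for q and c, the toy probability tables, and every name in the code (`forward_likelihood`, `emit`, `trans_c`, and so on) are assumptions made for illustration only.

```python
from itertools import product


def forward_likelihood(obs, init_q, trans_q, init_c, trans_c, emit):
    """Return P(obs) when each frame has a hidden pair (q, c).

    q is a phonetic state and c an auxiliary context variable (both
    hypothetical here). emit[q][c][o] is P(o | q, c), so the observation
    is conditioned on the context variable as well as on the state.
    Assumes obs is a non-empty list of discrete symbol indices.
    """
    states = list(product(range(len(init_q)), range(len(init_c))))

    # alpha[(q, c)] = P(o_1..o_t, q_t = q, c_t = c), initialised at t = 1.
    alpha = {
        (q, c): init_q[q] * init_c[c] * emit[q][c][obs[0]]
        for q, c in states
    }

    # Recursion: sum over the previous joint state, then apply the emission.
    for o in obs[1:]:
        alpha = {
            (q, c): emit[q][c][o] * sum(
                alpha[(pq, pc)] * trans_q[pq][q] * trans_c[pc][c]
                for pq, pc in states
            )
            for q, c in states
        }

    return sum(alpha.values())


# Toy parameters (made up): two phonetic states, a binary context variable,
# and two observation symbols.
INIT_Q = [1.0, 0.0]
TRANS_Q = [[0.6, 0.4], [0.0, 1.0]]
INIT_C = [0.5, 0.5]
TRANS_C = [[0.9, 0.1], [0.1, 0.9]]
EMIT = [
    [[0.8, 0.2], [0.3, 0.7]],  # state 0: P(o | q=0, c=0), P(o | q=0, c=1)
    [[0.1, 0.9], [0.6, 0.4]],  # state 1: P(o | q=1, c=0), P(o | q=1, c=1)
]

if __name__ == "__main__":
    print(forward_likelihood([0, 1, 1], INIT_Q, TRANS_Q, INIT_C, TRANS_C, EMIT))
```

In this sketch a multivalued context variable corresponds to a longer `init_c` and a larger `trans_c`. Additional context variables would enlarge the joint state in the same way, with the cost of the forward pass growing accordingly.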

Bibliographic reference. Zweig, Geoffrey / Russell, Stuart (1998): "Probabilistic modeling with Bayesian networks for automatic speech recognition", in ICSLP-1998, paper 0858.