Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

Dependency Modeling with Bayesian Networks in a Voicemail Transcription System

Geoffrey Zweig, Mukund Padmanabhan

IBM T. J. Watson Research Center, Yorktown Heights, NY, USA

In this paper we apply Bayesian networks to the problem of voicemail transcription. We use a Bayesian network system to test a variety of probabilistic models that represent acoustic context in addition to phonetic state and acoustic observations. The context variable can capture contextual phenomena that are not implied by the linguistic sequence of phones (e.g., noise level or speech rate). In rescoring experiments, we obtain a slight gain over a more standard system with a similar number of parameters. We obtained the best performance by conditioning the mixture coefficients on context, thus implementing a state-wise tied mixture system. In an utterance-clustering system, analysis of the learned parameters indicates that the context variable is highly correlated with C0 and C1.
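
As an illustration of what conditioning the mixture coefficients on context can mean, the following is a minimal sketch rather than the authors' implementation. It assumes a single phonetic state with one-dimensional observations, a discrete context variable with two values, and toy parameters; the function names and all numbers are hypothetical. The Gaussians are shared across contexts within the state (tied), and only the mixture weights depend on the context.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def observation_likelihood(x, gaussians, weights_by_context, context):
    """p(x | state, context): shared Gaussians, context-dependent mixture weights."""
    weights = weights_by_context[context]
    return sum(w * gaussian_pdf(x, m, v) for w, (m, v) in zip(weights, gaussians))


def marginal_likelihood(x, gaussians, weights_by_context, context_prior):
    """p(x | state): sum over the context variable, weighted by its prior."""
    return sum(
        p_c * observation_likelihood(x, gaussians, weights_by_context, c)
        for c, p_c in context_prior.items()
    )


# Hypothetical toy parameters for one phonetic state (not taken from the paper).
GAUSSIANS = [(0.0, 1.0), (3.0, 1.0)]               # (mean, variance), shared by all contexts
WEIGHTS_BY_CONTEXT = {0: [0.9, 0.1], 1: [0.2, 0.8]}  # mixture weights per context value
CONTEXT_PRIOR = {0: 0.5, 1: 0.5}                   # assumed prior over the context variable

if __name__ == "__main__":
    for c in WEIGHTS_BY_CONTEXT:
        print(c, observation_likelihood(0.0, GAUSSIANS, WEIGHTS_BY_CONTEXT, c))
    print(marginal_likelihood(0.0, GAUSSIANS, WEIGHTS_BY_CONTEXT, CONTEXT_PRIOR))
```

In this sketch an observation near 0 is more likely under context 0 than under context 1, even though both contexts use the same two Gaussians. Only the weights differ, so adding context values adds few parameters.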


Bibliographic reference. Zweig, Geoffrey / Padmanabhan, Mukund (1999): "Dependency modeling with Bayesian networks in a voicemail transcription system", In EUROSPEECH'99, 1135-1138.