Symposium on Machine Learning in Speech and Language Processing (MLSLP)

Bellevue, WA, USA
June 27, 2011

Bayesian Sensing Hidden Markov Models for Speech Recognition

George Saon (1), Jen-Tzung Chien (2)

(1) IBM T.J. Watson Research Center, Yorktown Heights, NY, USA
(2) National Cheng-Kung University, Taiwan

We introduce Bayesian sensing hidden Markov models to represent speech data based on a set of state-dependent basis vectors. Incorporating a prior density over the sensing weights lets the corresponding precision parameters determine the relevance of a feature vector to the different bases. The model parameters, consisting of the basis vectors, the precision matrices of the sensing weights and the precision matrices of the reconstruction errors, are jointly estimated by maximizing the likelihood function, which is marginalized over the weights. We derive recursive solutions for these three sets of parameters, which are expressed via maximum a posteriori estimates of the sensing weights.
This model was fielded in the latest DARPA GALE Arabic Broadcast News transcription evaluation and has shown gains on the evaluation data over state-of-the-art discriminatively trained HMMs with conventional Gaussian mixture models.
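
As a rough illustration of the quantities named in the abstract, the following is a minimal sketch for a single state, not the authors' implementation. It assumes a linear-Gaussian form that the abstract does not spell out: a feature vector x is modelled as Phi w + e, with a zero-mean Gaussian prior on the sensing weights w (precision matrix A) and a zero-mean Gaussian reconstruction error e (precision matrix R). The function names and the symbols Phi, A and R are hypothetical and chosen only for this example.

```python
import numpy as np


def map_sensing_weights(x, Phi, A, R):
    """MAP weights for the assumed model x = Phi w + e,
    with w ~ N(0, A^-1) and e ~ N(0, R^-1)."""
    # Posterior precision of the weights combines data fit and prior.
    post_prec = Phi.T @ R @ Phi + A
    # Solve post_prec @ w = Phi^T R x instead of forming an explicit inverse.
    return np.linalg.solve(post_prec, Phi.T @ R @ x)


def marginal_log_likelihood(x, Phi, A, R):
    """log N(x; 0, R^-1 + Phi A^-1 Phi^T): the likelihood of x
    after integrating out the sensing weights."""
    dim = x.shape[0]
    cov = np.linalg.inv(R) + Phi @ np.linalg.solve(A, Phi.T)
    _, logdet = np.linalg.slogdet(cov)
    quad = x @ np.linalg.solve(cov, x)
    return -0.5 * (dim * np.log(2.0 * np.pi) + logdet + quad)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim, n_bases = 5, 3                        # toy sizes, for illustration only
    Phi = rng.standard_normal((dim, n_bases))  # basis vectors as columns
    A = np.diag([1.0, 2.0, 3.0])               # weight precisions
    R = np.diag(rng.uniform(0.5, 2.0, dim))    # reconstruction-error precisions
    x = rng.standard_normal(dim)               # one feature vector

    w = map_sensing_weights(x, Phi, A, R)
    print("MAP weights:", w)
    print("marginal log-likelihood:", marginal_log_likelihood(x, Phi, A, R))
```

Under these assumptions, a large diagonal entry of A shrinks the corresponding weight toward zero, which is one way to read the statement that the precision parameters determine the relevance of a feature vector to each basis.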


Bibliographic reference.  Saon, George / Chien, Jen-Tzung (2011): "Bayesian sensing hidden Markov models for speech recognition", In MLSLP-2011.