9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Nonlinear Mixture Autoregressive Hidden Markov Models for Speech Recognition

Sundar Srinivasan (1), Tao Ma (1), Daniel May (1), Georgios Lazarou (2), Joseph Picone (1)

(1) Mississippi State University, USA; (2) New York City Transit Authority, USA

Gaussian mixture models are a very successful method for modeling the output distribution of a state in a hidden Markov model (HMM). However, this approach is limited by the assumption that the dynamics of speech features are linear and can be modeled with static features and their derivatives. In this paper, a nonlinear mixture autoregressive (MAR) model is used to model the state output distributions, giving a MAR-HMM. Estimation of the model parameters is extended to handle vector features. On two pilot classification tasks, MAR-HMMs are shown to outperform comparable Gaussian mixture model-based HMMs (GMM-HMMs) while having lower complexity.
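
To make the idea of a mixture autoregressive output distribution concrete, the following is a minimal sketch and not the authors' implementation. It assumes the commonly used scalar form of a mixture autoregressive model, in which each mixture component predicts the current sample from a linear combination of past samples and the prediction error is Gaussian. The vector-feature extension described in the abstract is not reproduced, and the function name `mar_log_likelihood` and all parameter names are hypothetical.

```python
import math


def mar_log_likelihood(x, weights, offsets, coeffs, variances):
    """Log-likelihood of a scalar sequence under an assumed mixture AR model.

    Assumed density for each sample (scalar case only):
        p(x[t] | past) = sum_k weights[k] * Normal(x[t]; mean_k, variances[k])
        mean_k = offsets[k] + sum_i coeffs[k][i] * x[t - 1 - i]
    The first `order` samples are used only as history.
    """
    order = len(coeffs[0])
    total = 0.0
    for t in range(order, len(x)):
        # Most recent sample first: x[t-1], x[t-2], ...
        past = x[t - order:t][::-1]
        density = 0.0
        for w, a0, a, var in zip(weights, offsets, coeffs, variances):
            mean = a0 + sum(ai * xi for ai, xi in zip(a, past))
            norm = math.sqrt(2.0 * math.pi * var)
            density += w * math.exp(-0.5 * (x[t] - mean) ** 2 / var) / norm
        # No underflow protection here; a practical version would work in
        # the log domain.
        total += math.log(density)
    return total


# Hypothetical usage: a two-component, first-order model on a toy sequence.
sequence = [0.1, 0.3, 0.2, 0.5, 0.4]
score = mar_log_likelihood(
    sequence,
    weights=[0.6, 0.4],
    offsets=[0.0, 0.1],
    coeffs=[[0.9], [-0.5]],
    variances=[0.05, 0.2],
)
```

In an HMM, a score of this kind would take the place of the Gaussian mixture likelihood for a state. Because each component conditions on past samples, the dynamics are captured by the model itself instead of by appended derivative features.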

Bibliographic reference. Srinivasan, Sundar / Ma, Tao / May, Daniel / Lazarou, Georgios / Picone, Joseph (2008): "Nonlinear mixture autoregressive hidden Markov models for speech recognition", in INTERSPEECH-2008, 960-963.