8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


Speech Recognition with a Generative Factor Analyzed Hidden Markov Model

Kaisheng Yao (1), Kuldip K. Paliwal (2), Te-Won Lee (1)

(1) University of California at San Diego, USA
(2) Griffith University, Australia

We present a generative factor analyzed hidden Markov model (GFA-HMM) for automatic speech recognition. In a traditional HMM, the observation vectors are modeled by a mixture of Gaussians (MoG) that depends on a discrete-valued hidden state sequence. The GFA-HMM introduces a hierarchy of continuous-valued latent representations of the observation vectors, in which the latent vectors at one level are acoustic-unit dependent and the latent vectors at a higher level are acoustic-unit independent. An expectation maximization (EM) algorithm is derived for maximum likelihood estimation of the model parameters. The GFA-HMM can achieve a much more compact representation of the intra-frame statistics of observation vectors than a traditional HMM. We conducted an experiment showing that the GFA-HMM can achieve better performance than a traditional HMM with the same amount of training data but a much smaller number of model parameters.
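The compactness claim can be illustrated with a plain, single-level factor analyzer. The sketch below is not the GFA-HMM itself: it omits the HMM states, the mixture components, and the two-level latent hierarchy described in the abstract. It only shows the standard factor analysis generative step, x = mean + loading * z + noise, and compares its covariance parameter count with that of a full covariance matrix. The function names, the feature dimension of 39, and the 5 latent factors are assumptions chosen for illustration and are not taken from the paper.

```python
import random


def full_cov_param_count(dim):
    """Free parameters in a symmetric dim x dim covariance matrix."""
    return dim * (dim + 1) // 2


def fa_param_count(dim, n_factors):
    """Covariance parameters of a factor analyzer:
    a dim x n_factors loading matrix plus dim diagonal noise variances."""
    return dim * n_factors + dim


def sample_fa(loading, noise_var, mean, rng):
    """Draw one observation x = mean + loading @ z + eps,
    with z ~ N(0, I) and eps ~ N(0, diag(noise_var))."""
    n_factors = len(loading[0])
    z = [rng.gauss(0.0, 1.0) for _ in range(n_factors)]
    return [
        mean[i]
        + sum(loading[i][j] * z[j] for j in range(n_factors))
        + rng.gauss(0.0, noise_var[i] ** 0.5)
        for i in range(len(mean))
    ]


# Illustrative sizes only (assumed, not from the paper).
dim, n_factors = 39, 5
rng = random.Random(0)
loading = [[rng.gauss(0.0, 1.0) for _ in range(n_factors)] for _ in range(dim)]
noise_var = [rng.uniform(0.1, 1.0) for _ in range(dim)]
mean = [0.0] * dim

x = sample_fa(loading, noise_var, mean, rng)
print(len(x), fa_param_count(dim, n_factors), full_cov_param_count(dim))
```

With these assumed sizes, the factor analyzer uses 234 covariance parameters where a full covariance matrix would need 780. The gap widens as the feature dimension grows relative to the number of factors.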

Bibliographic reference.  Yao, Kaisheng / Paliwal, Kuldip K. / Lee, Te-Won (2003): "Speech recognition with a generative factor analyzed hidden Markov model", In EUROSPEECH-2003, 849-852.