EUROSPEECH 2003 - INTERSPEECH 2003
We present a generative factor analyzed hidden Markov model (GFA-HMM) for automatic speech recognition. In a traditional HMM, the observation vectors are represented by mixture of Gaussians (MoG) that are dependent on discrete-valued hidden state sequence. The GFA-HMM introduces a hierarchy of continuous-valued latent representation of observation vectors, where latent vectors in one level are acoustic-unit dependent and the latent vectors in a higher level are acoustic-unit independent. An expectation maximization (EM) algorithm is derived for maximum likelihood parameter estimation of the model. The GFA-HMM can achieve a much more compact representation of the intra-frame statistics of observation vectors than traditional HMM. We conducted an experiment to show that the GFA-HMM can achieve better performances over traditional HMM with the same amount of training data but much smaller number of model parameters.
Bibliographic reference. Yao, Kaisheng / Paliwal, Kuldip K. / Lee, Te-Won (2003): "Speech recognition with a generative factor analyzed hidden Markov model", In EUROSPEECH-2003, 849-852.