ISCA Archive Interspeech 2008

Using KL-based acoustic models in a large vocabulary recognition task

Guillermo Aradilla, Hervé Bourlard, Mathew Magimai Doss

Posterior probabilities of sub-word units have been shown to be an effective front-end for ASR. However, attempts to model this type of feature either do not benefit from modeling context-dependent phonemes, or use an inefficient distribution to estimate the state likelihood. This paper presents a novel acoustic model for posterior features that overcomes these limitations. The proposed model can be seen as an HMM where the score associated with each state is the KL divergence between a distribution characterizing the state and the posterior features from the test utterance. This KL-based acoustic model establishes a framework where other models for posterior features such as hybrid HMM/MLP and discrete HMM can be seen as particular cases. Experiments on the WSJ database show that the KL-based acoustic model can significantly outperform these latter approaches. Moreover, the proposed model can obtain comparable results to complex systems, such as HMM/GMM, using significantly fewer parameters.
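
As a rough illustration of the state score described in the abstract, the sketch below computes a KL divergence between a state distribution and one frame of posterior features. It is a minimal sketch under assumptions, not the authors' implementation: the abstract does not say which direction of the divergence is used, so KL(state || frame) is chosen here arbitrarily, and the function names, the smoothing constant `eps`, and the three-class example vectors are all hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Return KL(p || q) for two discrete distributions over the same classes.

    eps is an illustrative smoothing constant (an assumption, not from the
    paper) that avoids taking the log of zero.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of classes")
    # Terms with p_i == 0 contribute nothing to the sum, so they are skipped.
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )


def state_score(state_dist, posterior_frame):
    """Local score of one HMM state for one frame (lower means better match).

    Assumption: the divergence is taken from the state distribution to the
    posterior feature vector; the reverse direction is equally plausible.
    """
    return kl_divergence(state_dist, posterior_frame)


# Hypothetical 3-class example: one state distribution and two test frames.
state_dist = [0.7, 0.2, 0.1]
matching_frame = [0.6, 0.3, 0.1]
mismatched_frame = [0.1, 0.2, 0.7]

score_match = state_score(state_dist, matching_frame)
score_mismatch = state_score(state_dist, mismatched_frame)
```

In a decoder, a score of this kind would be accumulated along the HMM state sequence in place of a negative log-likelihood, so a frame whose posteriors resemble the state distribution adds a small cost and a dissimilar frame adds a large one.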

doi: 10.21437/Interspeech.2008-110

Cite as: Aradilla, G., Bourlard, H., Doss, M.M. (2008) Using KL-based acoustic models in a large vocabulary recognition task. Proc. Interspeech 2008, 928-931, doi: 10.21437/Interspeech.2008-110

  author={Guillermo Aradilla and Hervé Bourlard and Mathew Magimai Doss},
  title={{Using KL-based acoustic models in a large vocabulary recognition task}},
  booktitle={Proc. Interspeech 2008},