ISCA Archive Interspeech 2007

Structural Bayesian language modeling and adaptation

Sibel Yaman, Jen-Tzung Chien, Chin-Hui Lee

We propose a language modeling and adaptation framework based on the Bayesian structural maximum a posteriori (SMAP) principle, in which each n-gram event is embedded in a branch of a tree structure. The nodes in the first layer of the tree represent the unigrams, those in the second layer represent the bigrams, and so on. Each node has an associated hyper-parameter that encodes information about the prior distribution, and a count of the number of times the corresponding word sequence occurs in the domain-specific data. In general, a hyper-parameter depends on the observation frequency not only of the node's own event but also of its parent node, which represents the lower-order n-gram event. Our automatic speech recognition experiments using the Wall Street Journal corpus verify that the proposed SMAP language model adaptation achieves a 5.6% relative improvement over maximum likelihood language models obtained with the same training and adaptation data sets.
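
The tree described above can be pictured with a minimal Python sketch. It is an illustration only, not the authors' implementation: the abstract does not give the SMAP update formulas, so the names `Node`, `add_counts`, `set_hypers`, `estimate`, the weight `tau`, and the use of background-data counts as the source of the prior are all assumptions. The hyper-parameter rule and the additive-prior estimate are simple stand-ins chosen so that each hyper-parameter depends on both the node's frequency and its parent's frequency.

```python
from dataclasses import dataclass, field
from typing import Dict, Sequence


@dataclass
class Node:
    """One n-gram event; the path from the root to this node spells the word sequence."""
    count: int = 0        # occurrences in the domain-specific (adaptation) data
    prior_count: int = 0  # occurrences in background data (assumed source of the prior)
    hyper: float = 0.0    # hyper-parameter, filled in by set_hypers (placeholder rule)
    children: Dict[str, "Node"] = field(default_factory=dict)


def add_counts(root: Node, tokens: Sequence[str], max_order: int, attr: str) -> None:
    """Add every n-gram up to max_order to the tree, incrementing `attr` on its node."""
    for i in range(len(tokens)):
        node = root
        for word in tokens[i:i + max_order]:
            node = node.children.setdefault(word, Node())
            setattr(node, attr, getattr(node, attr) + 1)


def set_hypers(node: Node, parent_total: int, tau: float = 1.0) -> None:
    """Hypothetical rule: hyper = tau * prior_count(node) / prior_count(parent)."""
    for child in node.children.values():
        child.hyper = tau * child.prior_count / parent_total if parent_total else 0.0
        set_hypers(child, child.prior_count, tau)


def estimate(root: Node, ngram: Sequence[str]) -> float:
    """Additive-prior estimate of P(last word | history); a stand-in, not the paper's formula."""
    parent = root
    for word in ngram[:-1]:
        parent = parent.children[word]
    child = parent.children[ngram[-1]]
    total = sum(c.count + c.hyper for c in parent.children.values())
    return (child.count + child.hyper) / total


root = Node()
add_counts(root, "a b a b c".split(), 2, "prior_count")  # toy background data
add_counts(root, "a b a c".split(), 2, "count")          # toy domain-specific data
set_hypers(root, sum(c.prior_count for c in root.children.values()))
p_b_given_a = estimate(root, ["a", "b"])
```

In this toy setup the first layer holds the unigrams `a`, `b`, `c` and the second layer holds the bigrams observed in either data set. Normalising over a node's children keeps the estimates under each history summing to one.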

doi: 10.21437/Interspeech.2007-269

Cite as: Yaman, S., Chien, J.-T., Lee, C.-H. (2007) Structural Bayesian language modeling and adaptation. Proc. Interspeech 2007, 2365-2368, doi: 10.21437/Interspeech.2007-269

  author={Sibel Yaman and Jen-Tzung Chien and Chin-Hui Lee},
  title={{Structural Bayesian language modeling and adaptation}},
  booktitle={Proc. Interspeech 2007},