ISCA Archive ICSLP 2000

Hierarchical statistical language models: experiments on in-domain adaptation

Lucian Galescu, James Allen

We introduce a hierarchical statistical language model, represented as a collection of local models plus a general sentence model. We provide an example that combines a general trigram model with a local PFSA (probabilistic finite-state automaton) model for the class of decimal numbers, described in terms of sub-word units (graphemes). In effect, this extends the vocabulary of the overall model to infinite size, yet the model still outperforms a word-based model.

Through in-domain language model adaptation experiments, we show that well-trained local models can encode enough linguistic information to be ported to new language models without re-estimation.
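
To make the hierarchical idea concrete, here is a minimal sketch of how a local PFSA over graphemes could be combined with a general model. It is not the authors' implementation. The state names, the transition probabilities, the `</n>` end marker, and the functions `build_pfsa`, `local_logprob`, and `word_logprob` are all hypothetical choices made for illustration. The sketch assumes the usual class-based decomposition, in which the general model supplies the probability of the number class given the history and the local model supplies the probability of the specific grapheme string given that class.

```python
import math

DIGITS = "0123456789"
END = "</n>"  # hypothetical end-of-number marker


def build_pfsa():
    """Toy PFSA for decimal numbers; all probabilities are made up.

    Each state maps a symbol to (next_state, probability), and the
    outgoing probabilities of every state sum to 1.
    """
    return {
        "start": {d: ("int", 0.1) for d in DIGITS},
        "int": {
            **{d: ("int", 0.05) for d in DIGITS},
            ".": ("point", 0.2),
            END: ("end", 0.3),
        },
        "point": {d: ("frac", 0.1) for d in DIGITS},
        "frac": {
            **{d: ("frac", 0.04) for d in DIGITS},
            END: ("end", 0.6),
        },
    }


def local_logprob(token, pfsa=None):
    """Log-probability of a grapheme string under the local model."""
    pfsa = pfsa or build_pfsa()
    state, logp = "start", 0.0
    for symbol in list(token) + [END]:
        arc = pfsa.get(state, {}).get(symbol)
        if arc is None:
            # No transition: the string is outside the number class.
            return float("-inf")
        state, prob = arc
        logp += math.log(prob)
    return logp


def word_logprob(token, class_logprob):
    """Combine the general and local models.

    class_logprob stands in for log P(NUMBER | history) from the general
    trigram model; here it is simply passed in as an assumed value.
    """
    return class_logprob + local_logprob(token)


if __name__ == "__main__":
    # Any well-formed decimal string receives non-zero probability,
    # which is why the effective vocabulary is unbounded.
    print(math.exp(local_logprob("3.5")))
    print(word_logprob("42", math.log(0.01)))
```

Because the local model is self-contained, the same `build_pfsa()` table could in principle be reused with a different general model, which is the kind of portability the adaptation experiments examine.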


Cite as: Galescu, L., Allen, J. (2000) Hierarchical statistical language models: experiments on in-domain adaptation. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 1, 186-189

@inproceedings{galescu00_icslp,
  author={Lucian Galescu and James Allen},
  title={{Hierarchical statistical language models: experiments on in-domain adaptation}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  volume={1},
  pages={186--189}
}