Sixth International Conference on Spoken Language Processing (ICSLP 2000)
October 16-20, 2000
Hierarchical Statistical Language Models: Experiments on In-Domain Adaptation
Lucian Galescu, James Allen
University of Rochester, USA
We introduce a hierarchical statistical language model,
represented as a collection of local models plus a general
sentence model. We provide an example that mixes a trigram
general model and a PFSA local model for the class of decimal
numbers, described in terms of sub-word units (graphemes).
This model effectively extends the vocabulary of the overall
model to infinite size, yet still performs better than a
word-based model.
Using in-domain language model adaptation experiments, we
show that local models, if well trained, can encode enough
linguistic information to be ported to new language models
without re-estimation.
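
As an illustration only, the kind of model described above can be sketched as a word-level trigram in which decimal numbers are replaced by a single class token, with a small PFSA over graphemes scoring the number itself. This is not the authors' implementation: the state names, the `<NUM>` token, the function names, and every probability below are hypothetical values chosen for the example.

```python
import math

DIGITS = "0123456789"

# Hypothetical PFSA for decimal numbers over graphemes (all values are made up).
# Each state maps an arc label to (probability, next_state). A "digit" arc's
# probability is shared uniformly by the ten digits; "end" terminates the number.
PFSA = {
    "start": {"digit": (1.0, "int")},
    "int":   {"digit": (0.5, "int"), ".": (0.3, "point"), "end": (0.2, None)},
    "point": {"digit": (1.0, "frac")},
    "frac":  {"digit": (0.6, "frac"), "end": (0.4, None)},
}


def number_logprob(graphemes):
    """Log-probability of a grapheme string under the local PFSA (-inf if rejected)."""
    state, logp = "start", 0.0
    for ch in graphemes:
        label = "digit" if ch in DIGITS else ch
        arcs = PFSA[state]
        if label not in arcs:
            return -math.inf
        prob, state = arcs[label]
        if label == "digit":
            prob /= len(DIGITS)  # uniform choice among the ten digits
        logp += math.log(prob)
    if "end" not in PFSA[state]:
        return -math.inf  # e.g. "3." stops in a state that cannot end
    return logp + math.log(PFSA[state]["end"][0])


def sentence_logprob(words, trigram, floor=1e-4):
    """Trigram over words and the <NUM> class, plus the local model for each number.

    `trigram` maps (w1, w2, w3) to a probability; unseen trigrams get `floor`,
    a crude stand-in for real smoothing.
    """
    history, logp = ("<s>", "<s>"), 0.0
    for word in list(words) + ["</s>"]:
        local = number_logprob(word)
        token = "<NUM>" if local > -math.inf else word
        logp += math.log(trigram.get(history + (token,), floor))
        if token == "<NUM>":
            logp += local  # the sub-word model scores the actual digits
        history = (history[1], token)
    return logp


# Toy general model with hypothetical probabilities.
toy_trigram = {
    ("<s>", "<s>", "costs"): 0.5,
    ("<s>", "costs", "<NUM>"): 0.4,
    ("costs", "<NUM>", "</s>"): 0.5,
}
score = sentence_logprob(["costs", "3.14"], toy_trigram)
```

Because any well-formed digit string receives a nonzero probability from the PFSA, the vocabulary is unbounded even though the trigram table only contains the class token. The local model also does not depend on the trigram table, so in this sketch `number_logprob` could be paired with a different general model unchanged.
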
Galescu, Lucian / Allen, James (2000):
"Hierarchical statistical language models: experiments on in-domain adaptation",
In ICSLP-2000, vol.1, 186-189.