4th International Conference on Spoken Language Processing

Philadelphia, PA, USA
October 3-6, 1996

Evaluation of a Language Model using a Clustered Model Backoff

John Miller, Fil Alleva

Microsoft Corporation, Redmond, WA, USA

In this paper, we describe and evaluate a language model using word classes automatically generated by a word clustering algorithm. Class-based language models have been shown to be effective for rapid adaptation, training on small datasets, and reducing memory usage. In terms of model perplexity, prior work has shown diminishing returns for class-based language models constructed from very large training sets. This paper describes a method of using a class model as a backoff for a bigram model, which produced significant benefits even when trained from a large text corpus. Test results on the Whisper continuous speech recognition system show that, for a given word error rate, the clustered bigram model uses 2/3 fewer parameters than a standard bigram model with a unigram backoff.
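The backoff scheme the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a fixed word-to-cluster mapping and a single fixed backoff weight `alpha` (the paper's actual smoothing and weight estimation are not specified here). When a word bigram was observed in training, its probability is used directly; otherwise the model backs off to a class bigram combined with a within-class word probability.

```python
from collections import defaultdict

class ClusteredBackoffBigram:
    """Toy class-based backoff bigram: P(w2|w1) falls back to
    alpha * P(C(w2)|C(w1)) * P(w2|C(w2)) when (w1, w2) is unseen.
    All probability tables are filled in externally for illustration."""

    def __init__(self, word_class, alpha=0.4):
        self.word_class = word_class            # word -> cluster id (assumed given)
        self.bigram = defaultdict(float)        # (w1, w2) -> P(w2 | w1)
        self.class_bigram = defaultdict(float)  # (c1, c2) -> P(c2 | c1)
        self.emit = defaultdict(float)          # w -> P(w | C(w))
        self.alpha = alpha                      # simplistic fixed backoff weight

    def prob(self, w1, w2):
        # Use the word bigram when it was observed in training ...
        if (w1, w2) in self.bigram:
            return self.bigram[(w1, w2)]
        # ... otherwise back off through the word classes.
        c1, c2 = self.word_class[w1], self.word_class[w2]
        return self.alpha * self.class_bigram[(c1, c2)] * self.emit[w2]

# Toy usage with made-up probabilities:
model = ClusteredBackoffBigram({"the": 0, "dog": 1, "cat": 1})
model.bigram[("the", "dog")] = 0.1       # observed word bigram
model.class_bigram[(0, 1)] = 0.5         # P(noun-class | det-class)
model.emit["cat"] = 0.2                  # P(cat | noun-class)

p_seen = model.prob("the", "dog")        # direct bigram: 0.1
p_backoff = model.prob("the", "cat")     # 0.4 * 0.5 * 0.2 = 0.04
```

Because unseen word pairs share the far smaller table of class-pair probabilities, the stored parameter count depends mainly on the observed bigrams kept, which is consistent with the parameter savings the abstract reports.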

Full Paper

Bibliographic reference: Miller, John / Alleva, Fil (1996): "Evaluation of a language model using a clustered model backoff", in ICSLP-1996, 390-393.