Third International Conference on Spoken Language Processing (ICSLP 94)

Yokohama, Japan
September 18-22, 1994

Language Models for Spontaneous Speech Recognition: A Bootstrap Method for Learning Phrase Digrams

Egidio Giachin, Paolo Baggia, Giorgio Micca

CSELT - Centro Studi e Laboratori Telecomunicazioni, Torino, Italy

This study addresses the search for language models suited to the recognition of spontaneous speech in task-specific man-machine dialogue systems. Bigrams are effective for this purpose, but they capture constraints only between adjacent words. Task-specific training corpora are very expensive to collect and are therefore likely to be too small to train trigrams reliably. On the other hand, the sentences used in these tasks are characterized by highly repetitive phrases, which occur often enough to suggest finding them automatically and modeling them as if they were individual dictionary elements, so as to favor their recognition. The word sequences to model are selected according to a perplexity minimization criterion; the selection is thus optimal insofar as perplexity is a reliable quality measure for a language model. The procedure is iterative: starting with the original 1-word elements, it finds the pair of words whose connection yields the largest perplexity reduction and joins them into a 2-word element. By cyclically repeating this step it "bootstraps" to longer-span elements, until no further perplexity reduction is obtained. Some variants of the algorithm are discussed and compared. The model produced a perplexity reduction of more than 20% over 1-word bigrams, which makes it favorably comparable to trigrams.
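
The iterative procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the add-alpha smoothing, the use of training-set perplexity, the `min_count` threshold, and all function names (`bigram_log_prob`, `perplexity`, `merge_pair`, `bootstrap_phrases`) are assumptions made for illustration. Perplexity is normalized by the original word count, so that values before and after a merge remain comparable.

```python
import math
from collections import Counter

def bigram_log_prob(sentences, alpha=0.1):
    """Corpus log-probability under an add-alpha bigram model
    trained on the same corpus (an assumption; the abstract does
    not specify the smoothing or the evaluation data)."""
    history, bigrams, vocab = Counter(), Counter(), set()
    for s in sentences:
        toks = ["<s>"] + s + ["</s>"]
        vocab.update(toks)
        for a, b in zip(toks, toks[1:]):
            history[a] += 1
            bigrams[a, b] += 1
    v = len(vocab)
    return sum(c * math.log((c + alpha) / (history[a] + alpha * v))
               for (a, b), c in bigrams.items())

def perplexity(sentences, n_words):
    """Perplexity per original word; n_words stays fixed across merges."""
    return math.exp(-bigram_log_prob(sentences) / n_words)

def merge_pair(sentences, pair):
    """Rewrite the corpus, joining each occurrence of pair into one element."""
    joined = "_".join(pair)
    out = []
    for s in sentences:
        merged, i = [], 0
        while i < len(s):
            if i + 1 < len(s) and (s[i], s[i + 1]) == pair:
                merged.append(joined)
                i += 2
            else:
                merged.append(s[i])
                i += 1
        out.append(merged)
    return out

def bootstrap_phrases(sentences, min_count=2, max_iters=20):
    """Greedily merge the pair giving the lowest perplexity,
    stopping when no candidate merge reduces it."""
    n_words = sum(len(s) + 1 for s in sentences)  # words plus end marker
    current = sentences
    best_ppl = perplexity(current, n_words)
    phrases = []
    for _ in range(max_iters):
        counts = Counter(p for s in current for p in zip(s, s[1:]))
        best = None
        for pair, c in counts.items():
            if c < min_count:
                continue  # hypothetical frequency threshold
            trial = merge_pair(current, pair)
            ppl = perplexity(trial, n_words)
            if ppl < best_ppl and (best is None or ppl < best[0]):
                best = (ppl, pair, trial)
        if best is None:
            break
        best_ppl, pair, current = best
        phrases.append("_".join(pair))
    return phrases, best_ppl
```

Calling `bootstrap_phrases` on a list of tokenized sentences returns the learned multi-word elements in the order they were merged, together with the final perplexity. Because merged elements re-enter the candidate pool, later iterations can join a 2-word element with a neighbor, which is how longer-span elements emerge. A held-out set would be a more reliable basis for the stopping criterion than the training-set perplexity assumed here.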

Bibliographic reference. Giachin, Egidio / Baggia, Paolo / Micca, Giorgio (1994): "Language models for spontaneous speech recognition: a bootstrap method for learning phrase digrams", in ICSLP-1994, 843-846.