Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

Phrase-Based Language Models for Speech Recognition

Hong-Kwang Jeff Kuo, Wolfgang Reichl

Bell Labs, Lucent Technologies, Murray Hill, NJ, USA

Including phrases in the vocabulary list can improve n-gram language models used in speech recognition. In this paper, we report results on automatically extracting phrases from the training text using frequency, likelihood, and correlation criteria. We show that a language model built from a vocabulary that includes useful phrases can systematically reduce language model perplexity on a natural language call-routing task and on the 20K-Nov92 Wall Street Journal evaluation. We also discuss the impact of such phrase-based language models on recognition word error rate.
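
The abstract does not give the exact formulas behind the frequency, likelihood, and correlation criteria, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a raw count threshold as the frequency criterion and pointwise mutual information (PMI) as a stand-in for a correlation-type criterion. The function names `score_bigrams` and `merge_phrases`, the thresholds, and the toy call-routing-style sentences are all hypothetical.

```python
import math
from collections import Counter


def score_bigrams(sentences, min_count=2):
    """Return {(w1, w2): (count, pmi)} for adjacent word pairs.

    Assumption: frequency criterion = raw count >= min_count,
    correlation criterion = pointwise mutual information (PMI).
    The paper's actual criteria may differ.
    """
    unigrams = Counter()
    bigrams = Counter()
    for words in sentences:
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # too rare to be a reliable phrase candidate
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        scores[(w1, w2)] = (count, math.log2(p_xy / (p_x * p_y)))
    return scores


def merge_phrases(words, phrases):
    """Greedily rewrite selected word pairs as single vocabulary tokens."""
    out = []
    i = 0
    while i < len(words):
        if i + 1 < len(words) and (words[i], words[i + 1]) in phrases:
            out.append(words[i] + "_" + words[i + 1])
            i += 2
        else:
            out.append(words[i])
            i += 1
    return out


# Hypothetical toy training text, not data from the paper.
corpus = [
    "i want to check my account balance".split(),
    "please check my account balance today".split(),
    "i want to pay my bill".split(),
]
scores = score_bigrams(corpus, min_count=2)
phrases = {pair for pair, (count, pmi) in scores.items() if pmi > 1.0}
print(merge_phrases(corpus[0], phrases))
```

After rewriting the training text this way, an ordinary n-gram model can be trained over the enlarged vocabulary, so a token such as `account_balance` is treated as one unit. Whether a given phrase helps perplexity or word error rate has to be measured on held-out data; this sketch does not do that.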


Bibliographic reference.  Kuo, Hong-Kwang Jeff / Reichl, Wolfgang (1999): "Phrase-based language models for speech recognition", In EUROSPEECH'99, 1595-1598.