5th International Conference on Spoken Language Processing
We introduce a number of techniques designed to help integrate semantic knowledge with N-gram language models for automatic speech recognition. Our techniques allow us to integrate Latent Semantic Analysis (LSA), a word-similarity algorithm based on word co-occurrence information, with N-gram models. While LSA is good at predicting content words which are coherent with the rest of a text, it is a bad predictor of frequent words, has a low dynamic range, and is inaccurate when combined linearly with N-grams. We show that modifying the dynamic range, applying a per-word confidence metric, and using geometric rather than linear combinations with N-grams produces a more robust language model which has a lower perplexity on a Wall Street Journal test-set than a baseline N-gram model.
Bibliographic reference. Coccaro, Noah / Jurafsky, Daniel (1998): "Towards better integration of semantic predictors in statistical language modeling", In ICSLP-1998, paper 0852.