ISCA Archive Eurospeech 1999

Combining nonlocal, syntactic and n-gram dependencies in language modeling

Jun Wu, Sanjeev Khudanpur

A new language model is presented which combines local N-gram dependencies with two important sources of long-range dependencies: the syntactic structure and the topic of a sentence. These dependencies, or constraints, are integrated using the maximum entropy method. Substantial improvements over a trigram model are demonstrated in both perplexity and speech recognition accuracy on the Switchboard task. It is shown that topic dependencies are most useful in predicting words which are semantically related to the subject matter of the conversation. Syntactic dependencies, on the other hand, are found to be most helpful in positions where the best predictors of the following word are not within N-gram range due to an intervening phrase or clause. It is also shown that these two methods individually enhance an N-gram model in complementary ways, and that the overall improvement from their combination is nearly additive.
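
For orientation, a maximum entropy model that integrates several kinds of constraints generally takes the exponential form sketched below. The notation is an assumption made here for illustration and is not taken from the paper: h_i stands for the full conditioning context of word w_i, the f_k are feature functions (which in a model of this kind could fire on N-gram, syntactic, or topic events), the \lambda_k are their weights, and Z(h_i) normalizes over the vocabulary V.

```latex
% Generic conditional maximum entropy model (illustrative notation, not the paper's)
P(w_i \mid h_i) \;=\; \frac{1}{Z(h_i)} \exp\!\Big( \sum_{k} \lambda_k \, f_k(h_i, w_i) \Big),
\qquad
Z(h_i) \;=\; \sum_{w' \in V} \exp\!\Big( \sum_{k} \lambda_k \, f_k(h_i, w') \Big)
```

In this form each source of dependency contributes its own features to the same sum in the exponent, which is what allows local and long-range constraints to be combined in a single model.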


doi: 10.21437/Eurospeech.1999-482

Cite as: Wu, J., Khudanpur, S. (1999) Combining nonlocal, syntactic and n-gram dependencies in language modeling. Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999), 2179-2182, doi: 10.21437/Eurospeech.1999-482

@inproceedings{wu99c_eurospeech,
  author={Jun Wu and Sanjeev Khudanpur},
  title={{Combining nonlocal, syntactic and n-gram dependencies in language modeling}},
  year=1999,
  booktitle={Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999)},
  pages={2179--2182},
  doi={10.21437/Eurospeech.1999-482}
}