8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


Efficient Linear Combination for Distant n-Gram Models

David Langlois, Kamel Smaili, Jean-Paul Haton

LORIA, France

This paper presents a large study of the use of distant language models. To combine distant and classical models efficiently, we adapt the back-off principle. We also show the importance of each part of a history for prediction: each sub-history is analyzed to estimate its predictive importance, and a weight is then associated with each class of sub-histories. The combined models therefore take into account the features of each part of the history rather than the whole history, as is done in other works. The contribution of distant n-gram models is significant: they improve perplexity by 12.8%. Making the linear combination depend on sub-histories yields an improvement of 5.3% over classical linear combination.
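
The idea of a linear combination whose weight depends on the class of the sub-history can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names (`combine`, `classical`, `distant`, `history_class`), the toy probability tables, the two-class split of histories, and the weight values. It does not reproduce the adapted back-off scheme the abstract mentions; it only shows plain interpolation between a classical bigram and a distance-2 bigram with a class-dependent weight.

```python
def combine(word, history, classical, distant, history_class, weights):
    """Interpolate two models with a weight chosen by the history's class."""
    lam = weights[history_class(history)]
    return lam * classical(word, history) + (1.0 - lam) * distant(word, history)


# Hypothetical toy tables; a real system would estimate these from a corpus.
CLASSICAL_TABLE = {("the", "cat"): 0.5, ("the", "dog"): 0.5}
DISTANT_TABLE = {("saw", "cat"): 0.25, ("saw", "dog"): 0.75}


def classical(word, history):
    """Classical bigram: condition on the immediately preceding word."""
    return CLASSICAL_TABLE.get((history[-1], word), 0.0)


def distant(word, history):
    """Distance-2 bigram: condition on the word two positions back."""
    return DISTANT_TABLE.get((history[-2], word), 0.0)


def history_class(history):
    """Hypothetical classifier: group histories by their last word."""
    return "function_word" if history[-1] in {"the", "a"} else "other"


# Hypothetical weights, one per class of sub-history.
WEIGHTS = {"function_word": 0.5, "other": 0.75}

history = ("saw", "the")
p_cat = combine("cat", history, classical, distant, history_class, WEIGHTS)
p_dog = combine("dog", history, classical, distant, history_class, WEIGHTS)
print(p_cat, p_dog)
```

With these toy values the two interpolated probabilities still sum to one, because each component model is normalized over the same two-word vocabulary and the weight lies between 0 and 1. In practice the class-dependent weights would be tuned on held-out data rather than fixed by hand.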


Bibliographic reference. Langlois, David / Smaili, Kamel / Haton, Jean-Paul (2003): "Efficient linear combination for distant n-gram models", In EUROSPEECH-2003, 409-412.