Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Discarding Impossible Events from Statistical Language Models

Armelle Brun, David Langlois, Kamel Smaili, Jean-Paul Haton

LORIA, Vandoeuvre-les-Nancy, France

This paper describes a method for detecting impossible bigrams in a space of V² bigrams, where V is the size of the vocabulary. The idea is to discard all ungrammatical events, which cannot occur in well-written text, and thereby to improve the language model. In speech recognition, we also expect to reduce the complexity of the search algorithm by making fewer comparisons. To achieve this, we extract the impossible bigrams using automatic rules based on grammatical classes. Ungrammatical biclass associations are detected, and all the corresponding bigrams are analyzed and marked as possible or impossible events. Because grammatical rules in natural language can have exceptions, we maintain an exception list for each of the retrieved rules.
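The filtering step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy lexicon, the class tags (DET, NOUN, VERB), the two ungrammatical class pairs, and the function name impossible_bigrams are all hypothetical, and the sketch assumes each word belongs to exactly one grammatical class, which the abstract does not state.

```python
from itertools import product

# Hypothetical toy lexicon: word -> grammatical class (illustrative only).
WORD_CLASS = {
    "the": "DET",
    "all": "DET",
    "cat": "NOUN",
    "sleeps": "VERB",
}

# Hypothetical rules: each ungrammatical class pair (biclass) maps to its
# own exception list, i.e. word pairs that stay possible despite the rule.
IMPOSSIBLE_BICLASSES = {
    ("DET", "DET"): {("all", "the")},  # "all the" is kept as an exception
    ("DET", "VERB"): set(),            # no exceptions in this toy example
}


def impossible_bigrams(word_class, rules):
    """Return the bigrams ruled out by a biclass rule and not excepted."""
    vocab = sorted(word_class)
    impossible = set()
    # Enumerate the full space of V^2 bigrams over the vocabulary.
    for w1, w2 in product(vocab, repeat=2):
        biclass = (word_class[w1], word_class[w2])
        if biclass in rules and (w1, w2) not in rules[biclass]:
            impossible.add((w1, w2))
    return impossible


if __name__ == "__main__":
    discarded = impossible_bigrams(WORD_CLASS, IMPOSSIBLE_BICLASSES)
    total = len(WORD_CLASS) ** 2
    print(f"{len(discarded)} of {total} bigrams marked impossible")
    for bigram in sorted(discarded):
        print(bigram)
```

In a language model, the bigrams returned here would presumably be assigned zero probability, with the freed mass redistributed over the remaining events; how the paper does this is not specified in the abstract.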


Bibliographic reference. Brun, Armelle / Langlois, David / Smaili, Kamel / Haton, Jean-Paul (2000): "Discarding impossible events from statistical language models", in ICSLP-2000, vol. 3, 981-984.