8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


Stem-Based Maximum Entropy Language Models for Inflectional Languages

Dimitrios Oikonomidis, Vassilios Digalakis

Technical University of Crete, Greece

In this work we build language models using three different modeling approaches: n-gram, class-based, and maximum entropy models. The main issue is the use of stem information to cope with the very large number of distinct words in an inflectional language such as Greek. We compare the three models in terms of both perplexity and word error rate. We also examine in detail the perplexity differences among the three models on specific subsets of words.
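As an illustration only, the following minimal sketch shows one common way stem information can enter a class-based bigram model and how perplexity is computed from it. It is not the authors' implementation: the factorization P(w | w_prev) = P(stem(w) | stem(w_prev)) * P(w | stem(w)), the add-one smoothing, the hypothetical `toy_stem` suffix stripper, and the English toy data are all assumptions made for this example.

```python
import math
from collections import Counter


def toy_stem(word):
    # Hypothetical stemmer for illustration: strips "ed" or "s" when at
    # least three characters remain. A real system would use a proper
    # morphological analyzer for the target language.
    for suffix in ("ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


class StemBigramModel:
    """Class-based bigram model in which each word's class is its stem (assumed setup)."""

    def __init__(self, tokens):
        stems = [toy_stem(w) for w in tokens]
        self.word_counts = Counter(tokens)
        self.stem_counts = Counter(stems)
        self.history_counts = Counter(stems[:-1])
        self.bigram_counts = Counter(zip(stems, stems[1:]))
        self.num_stems = len(self.stem_counts)

    def prob(self, prev_word, word):
        # Both words must come from the training vocabulary; the word-given-stem
        # term is left unsmoothed to keep the sketch short.
        prev_stem, stem = toy_stem(prev_word), toy_stem(word)
        # Add-one smoothed stem bigram probability.
        p_stem = (self.bigram_counts[(prev_stem, stem)] + 1) / (
            self.history_counts[prev_stem] + self.num_stems
        )
        # Relative frequency of the word form within its stem class.
        p_word = self.word_counts[word] / self.stem_counts[stem]
        return p_stem * p_word


def perplexity(model, tokens):
    # exp of the negative average log-probability over all bigram positions.
    log_prob = sum(math.log(model.prob(p, w)) for p, w in zip(tokens, tokens[1:]))
    return math.exp(-log_prob / (len(tokens) - 1))


train_tokens = "the dog walks the dogs walked the dog walked".split()
model = StemBigramModel(train_tokens)
test_perplexity = perplexity(model, "the dogs walked".split())
```

Because the history is reduced to its stem, the word pair "the dogs" receives probability mass from every training occurrence of "the" followed by any form of "dog", which is the kind of sharing that helps when a language has many inflected forms per stem.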


Bibliographic reference. Oikonomidis, Dimitrios / Digalakis, Vassilios (2003): "Stem-based maximum entropy language models for inflectional languages", in EUROSPEECH-2003, 2285-2288.