In this work, we integrate phrases, or segments of words, into class n-gram language models in order to take advantage of two information sources: words and categories. Two different approaches to this kind of model are proposed and formulated. The models were integrated into an Automatic Speech Recognition system and subsequently evaluated in terms of word error rate. The experiments, carried out on two different databases and languages, demonstrate that a language model based on categories composed of phrases can outperform classical class n-gram language models.
Bibliographic reference. Justo, Raquel / Torres, M. Inés (2007): "Phrases in category-based language models for Spanish and Basque ASR", in INTERSPEECH-2007, 2377-2380.