12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Large Vocabulary SOUL Neural Network Language Models

Hai-Son Le, Ilya Oparin, Abdel Messaoudi, Alexandre Allauzen, Jean-Luc Gauvain, François Yvon

LIMSI, France

This paper continues our research on Structured OUtput Layer neural network language models (SOUL NNLMs) for automatic speech recognition. Since SOUL NNLMs estimate probabilities for all in-vocabulary words, not only for those in a limited shortlist, we investigate their performance on a large-vocabulary task. Significant improvements in both perplexity and word error rate over conventional shortlist-based NNLMs are shown on a challenging Arabic GALE task with a recognition vocabulary of about 300k entries. We also propose a new training scheme for SOUL NNLMs, based on separate training of the out-of-shortlist part of the output layer. This scheme allows more data to be used at each training iteration without any considerable slow-down and brings additional improvements in speech recognition performance.
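The core idea behind a structured output layer can be illustrated with a minimal sketch: instead of one softmax over the whole vocabulary, words are grouped into classes, and P(w | h) is factored into a class probability times a within-class probability. The sketch below is a hypothetical toy (flat two-level factorization, made-up dimensions and random weights, not the paper's clustering or its 300k-word GALE setup); it only shows how such a factorization still yields a proper distribution over the full vocabulary without a shortlist.

```python
import math
import random

random.seed(0)

# Toy dimensions for illustration only (the paper's vocabulary is ~300k words).
HIDDEN = 8
NUM_CLASSES = 4
WORDS_PER_CLASS = 5
VOCAB = NUM_CLASSES * WORDS_PER_CLASS  # word w is assigned to class w // WORDS_PER_CLASS

def rand_matrix(rows, cols):
    return [[random.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

# Structured output layer parameters: one softmax over classes,
# plus a separate softmax over the words inside each class.
W_class = rand_matrix(NUM_CLASSES, HIDDEN)
W_word = [rand_matrix(WORDS_PER_CLASS, HIDDEN) for _ in range(NUM_CLASSES)]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def dot(row, h):
    return sum(a * b for a, b in zip(row, h))

def word_probability(h, w):
    """P(w | h) = P(class(w) | h) * P(w | class(w), h)."""
    c, i = divmod(w, WORDS_PER_CLASS)
    p_class = softmax([dot(row, h) for row in W_class])[c]
    p_in_class = softmax([dot(row, h) for row in W_word[c]])[i]
    return p_class * p_in_class

# Probabilities over the entire vocabulary sum to 1, with no shortlist needed.
h = [random.gauss(0.0, 1.0) for _ in range(HIDDEN)]
total = sum(word_probability(h, w) for w in range(VOCAB))
print(round(total, 9))  # 1.0
```

Note that each within-class softmax is small, which is what makes full-vocabulary normalization tractable compared with a single softmax over hundreds of thousands of words.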


Bibliographic reference: Le, Hai-Son / Oparin, Ilya / Messaoudi, Abdel / Allauzen, Alexandre / Gauvain, Jean-Luc / Yvon, François (2011): "Large vocabulary SOUL neural network language models", in INTERSPEECH-2011, 1469-1472.