8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Neural Network Language Models for Conversational Speech Recognition

Holger Schwenk, Jean-Luc Gauvain


Recently there has been increasing interest in using neural networks for language modeling. In contrast to the well-known backoff n-gram language models (LM), the neural network approach tries to limit the data sparseness problem by performing the estimation in a continuous space, thereby allowing smooth interpolation. This type of LM is therefore attractive for tasks where only a very limited amount of in-domain training data is available, in particular the modeling of spontaneous speech. In this paper we analyze the generalization behavior of the neural network LM for in-domain training corpora ranging from 7M to more than 21M words. In all cases, we observed significant word error reductions with respect to a carefully tuned 4-gram backoff language model in a state-of-the-art conversational speech recognizer for the NIST Rich Transcription evaluations. We also apply ensemble learning methods and discuss their relation to LM interpolation.
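The continuous-space idea described above can be illustrated with a minimal feed-forward sketch. This is not the authors' implementation: the toy vocabulary, layer sizes, and random weights below are all illustrative assumptions. Each context word is mapped to a real-valued embedding vector, the concatenated context is passed through a hidden layer, and a softmax produces a probability for every possible next word at once, so similar contexts yield smoothly similar distributions.

```python
# Minimal continuous-space (neural network) LM sketch.
# All sizes and the vocabulary are illustrative assumptions.
import math
import random

random.seed(0)

vocab = ["<s>", "the", "cat", "sat", "</s>"]  # toy vocabulary
V = len(vocab)
EMB = 8    # dimension of the continuous word representation
CTX = 3    # n-1 context words, i.e. a 4-gram-style model
HID = 16   # hidden layer size

def rand_matrix(rows, cols):
    # Small random initialization (training itself is omitted here)
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)]
            for _ in range(rows)]

E   = rand_matrix(V, EMB)            # word id -> continuous embedding
W_h = rand_matrix(HID, CTX * EMB)    # projection layer -> hidden layer
W_o = rand_matrix(V, HID)            # hidden layer -> vocabulary scores

def softmax(scores):
    m = max(scores)                  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def predict(context_ids):
    # Concatenate the context word embeddings into one input vector
    x = [v for wid in context_ids for v in E[wid]]
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W_h]
    scores = [sum(w * hi for w, hi in zip(row, h)) for row in W_o]
    return softmax(scores)           # P(w | context) over the whole vocabulary

ctx = [vocab.index("<s>"), vocab.index("the"), vocab.index("cat")]
p = predict(ctx)
print(sum(p))  # the next-word probabilities sum to 1
```

Because the output layer normalizes over the full vocabulary in one pass, every word receives a non-zero probability even for contexts never seen in training, which is the property that distinguishes this model from a discrete backoff n-gram.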


Bibliographic reference: Schwenk, Holger / Gauvain, Jean-Luc (2004): "Neural network language models for conversational speech recognition", in INTERSPEECH-2004, 2253-2256.