INTERSPEECH 2004 - ICSLP
Recently there has been increasing interest in using neural networks for language modeling. In contrast to the well-known backoff n-gram language models (LMs), the neural network approach limits the data sparseness problem by performing the estimation in a continuous space, thereby allowing smooth interpolation. This type of LM is therefore attractive for tasks for which only a very limited amount of in-domain training data is available, in particular the modeling of spontaneous speech. In this paper we analyze the generalization behavior of the neural network LM for in-domain training corpora ranging from 7M to more than 21M words. In all cases, we observed significant word error reductions with respect to a carefully tuned 4-gram backoff language model in a state-of-the-art conversational speech recognizer for the NIST Rich Transcription evaluations. We also apply ensemble learning methods and discuss their relation to LM interpolation.
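The LM interpolation mentioned above is commonly realized as a linear combination of the per-word probabilities of the two models, with the mixture weight tuned to minimize perplexity on held-out data. A minimal sketch, assuming illustrative toy probabilities (the variable names, numbers, and the simple grid search over the weight are assumptions, not the paper's actual setup):

```python
import math

# Hypothetical per-word probabilities assigned by two LMs to the same
# held-out word sequence (toy numbers, for illustration only).
p_neural = [0.30, 0.02, 0.25, 0.03]
p_backoff = [0.05, 0.20, 0.04, 0.25]

def perplexity(probs):
    """Perplexity of a word sequence given its per-word probabilities."""
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))

def interpolate(pa, pb, lam):
    """Linear interpolation: lam * P_neural(w|h) + (1 - lam) * P_backoff(w|h)."""
    return [lam * a + (1 - lam) * b for a, b in zip(pa, pb)]

# Grid-search the interpolation weight that minimizes held-out perplexity.
best_ppl, best_lam = min(
    (perplexity(interpolate(p_neural, p_backoff, l / 10)), l / 10)
    for l in range(11)
)
print(best_lam, best_ppl)
```

Because the two models are strong on different words in this toy example, the optimal weight lies strictly between 0 and 1, i.e. the mixture outperforms either model alone.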
Bibliographic reference. Schwenk, Holger / Gauvain, Jean-Luc (2004): "Neural network language models for conversational speech recognition", In INTERSPEECH-2004, 2253-2256.