12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Recurrent Neural Network Based Language Modeling in Meeting Recognition

Stefan Kombrink, Tomáš Mikolov, Martin Karafiát, Lukáš Burget

Brno University of Technology, Czech Republic

We use recurrent neural network (RNN) based language models to improve the BUT English meeting recognizer. On the baseline setup using the original language models, we decrease word error rate (WER) by more than 1% absolute through n-best list rescoring and language model adaptation. When n-gram language models are trained on the same moderately sized data set as the RNN models, the improvements are larger, yielding a system that performs comparably to the baseline. A noticeable improvement was observed with unsupervised adaptation of RNN models. Furthermore, we examine the influence of word history on WER and show how to speed up rescoring by caching common prefix strings.
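The prefix caching mentioned above exploits the fact that hypotheses in an n-best list often share long common prefixes, so the RNN's hidden state after a shared prefix can be computed once and reused. The sketch below illustrates the idea only; the `step` function is a toy stand-in (not the paper's RNN), and all names are hypothetical.

```python
def step(state, word):
    """Toy stand-in for one RNN LM step.

    A real implementation would return (log P(word | state), next hidden state);
    here a fixed log-probability and a hashed pseudo-state keep the sketch runnable.
    """
    new_state = hash((state, word)) % 1000
    return -1.0, new_state

def rescore_nbest(hypotheses):
    """Score each hypothesis, reusing cached RNN states for shared prefixes."""
    # prefix (tuple of words) -> (cumulative log-prob, RNN state after prefix)
    cache = {(): (0.0, 0)}
    scores = []
    for hyp in hypotheses:
        words = tuple(hyp.split())
        # Find the longest prefix of this hypothesis already in the cache.
        k = len(words)
        while words[:k] not in cache:
            k -= 1
        total, state = cache[words[:k]]
        # Extend from the cached point, caching every new prefix on the way.
        for i in range(k, len(words)):
            logprob, state = step(state, words[i])
            total += logprob
            cache[words[: i + 1]] = (total, state)
        scores.append(total)
    return scores
```

With hypotheses like "a b c" and "a b d", the second sentence reuses the cached state after "a b" and runs the (expensive) RNN step only for its final word, which is where the rescoring speed-up comes from.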


Bibliographic reference.  Kombrink, Stefan / Mikolov, Tomáš / Karafiát, Martin / Burget, Lukáš (2011): "Recurrent neural network based language modeling in meeting recognition", In INTERSPEECH-2011, 2877-2880.