Long-span neural network language models have yielded considerable improvements in speech recognition. However, these models are difficult to apply when the underlying search space is large.
In this paper, we combine previous work on lattice decoding with long short-term memory (LSTM) neural network language models. By adding refined pruning techniques, we are able to reduce the search effort by a factor of three.
Furthermore, we introduce two novel approximations for full lattice rescoring, which open up the potential of lattice-based speech recognition techniques. Compared to 1000-best lists, we find that we can increase the relative word error rate improvement obtained with LSTMs over a state-of-the-art baseline from 8.2% to 10.7%, while the resulting lattices are considerably smaller. In addition, we investigate the use of LSTMs for Babel Assamese keyword search, obtaining a significant relative improvement of 2.5%.
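For context, the 1000-best baseline mentioned above is standard N-best list rescoring: each hypothesis carries an acoustic score from first-pass decoding, a second-pass language model rescores its word sequence, and the hypothesis with the best combined score wins. The sketch below illustrates only this generic baseline, not the paper's lattice algorithms; `toy_lm_logprob` is a hypothetical stand-in for an LSTM language model, and the interpolation weight `lm_scale` is an assumed illustrative value.

```python
def toy_lm_logprob(words):
    # Hypothetical stand-in for an LSTM LM: returns a log-probability
    # for the word sequence (here a toy unigram table, for illustration).
    logprobs = {"the": -1.0, "cat": -2.0, "sat": -2.5, "hat": -4.0}
    return sum(logprobs.get(w, -6.0) for w in words)

def rescore_nbest(nbest, lm_scale=10.0):
    """Return the hypothesis maximizing acoustic score + lm_scale * LM score.

    nbest: list of (word_list, acoustic_log_score) pairs from a first pass.
    """
    def combined(hyp):
        words, acoustic_score = hyp
        return acoustic_score + lm_scale * toy_lm_logprob(words)
    return max(nbest, key=combined)

nbest = [
    (["the", "cat", "sat"], -120.0),
    (["the", "hat", "sat"], -118.0),  # slightly better acoustic score
]
best_words, _ = rescore_nbest(nbest)
print(" ".join(best_words))  # the LM overrides the acoustic preference
```

Lattice rescoring generalizes this idea: instead of enumerating a fixed list of hypotheses, the long-span model is applied along paths of the lattice, which is why the search-space growth addressed by the paper's pruning and approximation techniques becomes the central problem.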
Bibliographic reference. Sundermeyer, Martin / Tüske, Zoltán / Schlüter, Ralf / Ney, Hermann (2014): "Lattice decoding and rescoring with long-span neural network language models", in INTERSPEECH-2014, 661-665.