ISCA Archive Interspeech 2015

Multiscale recurrent neural network based language model

Tsuyoshi Morioka, Tomoharu Iwata, Takaaki Hori, Tetsunori Kobayashi

We describe a novel recurrent neural network-based language model (RNNLM) that deals with multiple time-scales of context. The RNNLM has become a technical standard in language modeling because it can retain context over some span of preceding words. However, a conventional RNNLM handles only a single time-scale of context, regardless of the upcoming words and the topic of the spoken utterance, even though the optimal time-scale of context can vary with these conditions. In contrast, our multiscale RNNLM flexibly exploits multiple time-scales of context simultaneously, weighting each appropriately when predicting the next word. Experimental comparisons on large-vocabulary spontaneous speech recognition demonstrate that introducing multiple time-scales of context into the RNNLM yields improvements over existing RNNLMs in both perplexity and word error rate.
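The abstract does not specify how the multiple time-scales are realized, so the following is only a rough, non-authoritative sketch of the general idea: several recurrent states that update at different rates (a clockwork-style scheme) are combined with learned weights to predict the next word. The update intervals, mixing scheme, dimensions, and all names below are illustrative assumptions, not the authors' architecture.

```python
# Illustrative sketch only: one assumed way to mix multiple context
# time-scales in an RNN language model. Update intervals, mixing weights,
# and hyperparameters are invented for demonstration, not from the paper.
import numpy as np

rng = np.random.default_rng(0)

V, E, H = 1000, 32, 64       # vocab size, embedding dim, hidden dim per scale
scales = [1, 4, 16]          # assumed update intervals: fast / medium / slow
S = len(scales)

emb = rng.normal(0, 0.1, (V, E))
W_in = [rng.normal(0, 0.1, (E, H)) for _ in scales]
W_rec = [rng.normal(0, 0.1, (H, H)) for _ in scales]
W_out = rng.normal(0, 0.1, (S * H, V))
w_mix = np.zeros(S)          # per-scale mixing logits (would be learned)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def step(word_id, hs, t):
    """One time step: slower scales update only every `scales[s]` steps,
    so they retain a longer-range summary of the context."""
    x = emb[word_id]
    for s in range(S):
        if t % scales[s] == 0:
            hs[s] = np.tanh(x @ W_in[s] + hs[s] @ W_rec[s])
    # weight each scale's contribution before predicting the next word
    mix = softmax(w_mix)
    h = np.concatenate([mix[s] * hs[s] for s in range(S)])
    return softmax(h @ W_out), hs

hs = [np.zeros(H) for _ in scales]
probs = None
for t, w in enumerate([3, 17, 42, 8]):   # toy word-id sequence
    probs, hs = step(w, hs, t)
print(probs.argmax())                    # predicted next-word id
```

In this toy version the fast state tracks the most recent words while the slow states change rarely, approximating topic-level context; the softmax over `w_mix` plays the role of the "proper weights" the abstract mentions.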


doi: 10.21437/Interspeech.2015-512

Cite as: Morioka, T., Iwata, T., Hori, T., Kobayashi, T. (2015) Multiscale recurrent neural network based language model. Proc. Interspeech 2015, 2366-2370, doi: 10.21437/Interspeech.2015-512

@inproceedings{morioka15_interspeech,
  author={Tsuyoshi Morioka and Tomoharu Iwata and Takaaki Hori and Tetsunori Kobayashi},
  title={{Multiscale recurrent neural network based language model}},
  year=2015,
  booktitle={Proc. Interspeech 2015},
  pages={2366--2370},
  doi={10.21437/Interspeech.2015-512}
}