Character-level Language Modeling with Gated Hierarchical Recurrent Neural Networks

Iksoo Choi, Jinhwan Park, Wonyong Sung


Recurrent neural network (RNN)-based language models are widely used in speech recognition and translation applications. We propose a gated hierarchical recurrent neural network (GHRNN) and apply it to character-level language modeling. A GHRNN consists of multiple RNN units that operate at different time scales, and the operating frequency of each unit is controlled by gates learned from the training data. In our model, the GHRNN learns the hierarchical structure of characters, sub-words, and words. Timing gates are included in the hierarchical connections to control the operating frequency of these units. Performance was measured on the Penn Treebank and WikiText-2 datasets. Experimental results showed lower bits per character (BPC) compared to simply layered or skip-connected RNN models. Moreover, when a continuous cache model is added, a BPC of 1.192 is achieved, which is comparable to the state-of-the-art result.
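The core mechanism described above, a higher-level recurrent unit whose update frequency is controlled by a learned scalar timing gate, can be sketched as follows. This is a minimal illustration with a plain tanh recurrence and randomly initialized weights standing in for trained parameters; the class and variable names are hypothetical, and the actual GHRNN uses LSTM units and gates trained end-to-end on the data.

```python
import math
import random

random.seed(0)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TimingGatedUnit:
    """Sketch of a higher-level recurrent unit whose operating frequency
    is controlled by a scalar timing gate in (0, 1). When the gate is
    near 0 the unit keeps its previous state, i.e. it effectively skips
    the time step; near 1 it performs a full update. (Hypothetical
    simplification of the paper's gated hierarchical connections.)"""

    def __init__(self, size):
        self.size = size
        # Random weights stand in for parameters learned from training data.
        self.w_h = [[random.uniform(-0.1, 0.1) for _ in range(size)]
                    for _ in range(size)]
        self.w_x = [[random.uniform(-0.1, 0.1) for _ in range(size)]
                    for _ in range(size)]
        self.w_g = [random.uniform(-0.1, 0.1) for _ in range(size)]

    def step(self, h, x):
        # Candidate state from a plain tanh recurrence over the previous
        # hidden state h and the lower-level unit's output x.
        cand = [
            math.tanh(
                sum(self.w_h[i][j] * h[j] for j in range(self.size))
                + sum(self.w_x[i][j] * x[j] for j in range(self.size))
            )
            for i in range(self.size)
        ]
        # Scalar timing gate computed from the lower-level input.
        g = sigmoid(sum(self.w_g[j] * x[j] for j in range(self.size)))
        # Gated update: a convex combination of the old and candidate states.
        return [g * cand[i] + (1.0 - g) * h[i] for i in range(self.size)]
```

Stacking several such units, each fed by the one below, yields a hierarchy in which upper units update less often and can track longer-range (sub-word- and word-level) structure while the lowest unit runs at every character.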


 DOI: 10.21437/Interspeech.2018-1727

Cite as: Choi, I., Park, J., Sung, W. (2018) Character-level Language Modeling with Gated Hierarchical Recurrent Neural Networks. Proc. Interspeech 2018, 411-415, DOI: 10.21437/Interspeech.2018-1727.


@inproceedings{Choi2018,
  author={Iksoo Choi and Jinhwan Park and Wonyong Sung},
  title={Character-level Language Modeling with Gated Hierarchical Recurrent Neural Networks},
  year={2018},
  booktitle={Proc. Interspeech 2018},
  pages={411--415},
  doi={10.21437/Interspeech.2018-1727},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1727}
}