ISCA Archive Interspeech 2015

Combining multiple-type input units using recurrent neural network for LVCSR language modeling

Vataya Chunwijitra, Ananlada Chotimongkol, Chai Wutiwiwatchai

In this paper, we investigate the use of a Recurrent Neural Network (RNN) for combining hybrid input types, namely words and pseudo-morphemes (PMs), for Thai LVCSR language modeling. As with other neural network frameworks, an RNN places no restriction on its input types. To exploit this advantage, the input vector of the proposed hybrid RNN language model (RNNLM) is a concatenation of a word vector and a PM vector. After first-pass decoding with an n-gram LM, the word-based lattice is expanded to include the corresponding PMs of each word. The hybrid RNNLM is then used to re-score the hybrid lattice in second-pass decoding. We tested our hybrid RNNLM on two recognition tasks: broadcast news transcription and mobile speech-to-speech translation. The proposed model achieved better recognition performance than a baseline word-based RNNLM, as hybrid input types provide more flexible unit choices for language model re-scoring. The computational complexity of a full-hybrid RNNLM can be reduced by limiting the input vector to only frequent words and PMs. In this reduced-hybrid RNNLM, the size of the input vector is cut roughly in half, which considerably reduces both training and decoding time without affecting recognition accuracy.
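The paper itself does not include code; the sketch below is only a rough illustration of the hybrid input idea described in the abstract: a one-hot word vector concatenated with a one-hot PM vector feeding a recurrent layer. The plain Elman recurrence, the class name HybridRNNLM, and all vocabulary sizes are assumptions for illustration, not the authors' implementation.

import numpy as np

class HybridRNNLM:
    """Minimal Elman-style RNNLM with a concatenated word+PM input (sketch)."""

    def __init__(self, n_words, n_pms, n_hidden, rng=None):
        rng = rng or np.random.default_rng(0)
        n_in = n_words + n_pms                              # concatenated input size
        self.U = rng.normal(0, 0.1, (n_hidden, n_in))       # input -> hidden
        self.W = rng.normal(0, 0.1, (n_hidden, n_hidden))   # hidden -> hidden
        self.V = rng.normal(0, 0.1, (n_words, n_hidden))    # hidden -> word output
        self.n_words, self.n_pms = n_words, n_pms

    def step(self, word_id, pm_id, h_prev):
        # Hybrid input: one-hot word part followed by one-hot PM part.
        x = np.zeros(self.n_words + self.n_pms)
        x[word_id] = 1.0
        x[self.n_words + pm_id] = 1.0
        h = np.tanh(self.U @ x + self.W @ h_prev)           # recurrent hidden state
        logits = self.V @ h
        p = np.exp(logits - logits.max())
        return p / p.sum(), h                               # next-word distribution

# Usage (illustrative sizes):
lm = HybridRNNLM(n_words=1000, n_pms=300, n_hidden=128)
h = np.zeros(128)
probs, h = lm.step(word_id=42, pm_id=7, h_prev=h)

Under this reading, the reduced-hybrid variant would correspond to shrinking n_words and n_pms to shortlists of frequent units, with remaining words and PMs mapped to a shared out-of-shortlist class, which halves the input dimension as the abstract describes.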


doi: 10.21437/Interspeech.2015-516

Cite as: Chunwijitra, V., Chotimongkol, A., Wutiwiwatchai, C. (2015) Combining multiple-type input units using recurrent neural network for LVCSR language modeling. Proc. Interspeech 2015, 2385-2389, doi: 10.21437/Interspeech.2015-516

@inproceedings{chunwijitra15_interspeech,
  author={Vataya Chunwijitra and Ananlada Chotimongkol and Chai Wutiwiwatchai},
  title={{Combining multiple-type input units using recurrent neural network for LVCSR language modeling}},
  year=2015,
  booktitle={Proc. Interspeech 2015},
  pages={2385--2389},
  doi={10.21437/Interspeech.2015-516}
}