Improving Conversation-Context Language Models with Multiple Spoken Language Understanding Models

Ryo Masumura, Tomohiro Tanaka, Atsushi Ando, Hosana Kamiyama, Takanobu Oba, Satoshi Kobashikawa, Yushi Aono


In this paper, we integrate fully neural-network-based conversation-context language models (CCLMs), which are suitable for handling multi-turn conversational automatic speech recognition (ASR) tasks, with multiple neural spoken language understanding (SLU) models. The main strength of CCLMs is their capacity to take long-range interactive contexts beyond utterance boundaries into account. However, it is hard to optimize CCLMs so that they fully exploit these long-range interactive contexts, because conversation-level training datasets are often limited. To mitigate this problem, our key idea is to introduce into the CCLMs various SLU models originally developed for spoken dialogue systems. In our proposed method, which we call “SLU-assisted CCLM,” hierarchical recurrent encoder-decoder language modeling is extended to handle utterance-level SLU results for preceding utterances in a continuous space. We expect the SLU models to help the CCLMs properly understand the semantic meaning of long-range interactive contexts and to fully leverage it when estimating the next utterance. Our experiments on contact center dialogue ASR tasks demonstrate that SLU-assisted CCLMs combined with three types of SLU models yield ASR performance improvements.
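The architecture described above — a hierarchical recurrent encoder-decoder whose conversation-level state also consumes continuous SLU representations of each preceding utterance — can be sketched roughly as follows. This is a minimal illustrative sketch, not the authors' implementation: all dimensions, the plain Elman RNN cells, and the idea of representing each SLU model's output as a fixed-size vector concatenated to the utterance encoding are assumptions for exposition.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumed, not from the paper):
V, E, H, S = 20, 8, 16, 4  # vocab, word-embedding, hidden, per-SLU-vector dims

def rnn_step(x, h, Wx, Wh, b):
    """One Elman RNN step: h' = tanh(Wx x + Wh h + b)."""
    return np.tanh(Wx @ x + Wh @ h + b)

def make_rnn(in_dim, hid_dim):
    """Randomly initialized RNN parameters (Wx, Wh, b)."""
    return (rng.normal(0, 0.1, (hid_dim, in_dim)),
            rng.normal(0, 0.1, (hid_dim, hid_dim)),
            np.zeros(hid_dim))

emb = rng.normal(0, 0.1, (V, E))      # word embeddings
utt_rnn = make_rnn(E, H)              # utterance-level encoder RNN
conv_rnn = make_rnn(H + 3 * S, H)     # conversation-level RNN: utterance
                                      # encoding + three SLU vectors per turn
dec_rnn = make_rnn(E, H)              # decoder RNN over the next utterance
W_out = rng.normal(0, 0.1, (V, H))    # projection to the vocabulary

def encode_utterance(word_ids):
    """Encode one utterance into a fixed-size vector (last hidden state)."""
    h = np.zeros(H)
    for w in word_ids:
        h = rnn_step(emb[w], h, *utt_rnn)
    return h

def next_word_dist(history, slu_vecs_per_utt, prefix_ids):
    """P(next word | preceding utterances, their SLU results, current prefix).

    Each past utterance contributes its encoding concatenated with three
    SLU vectors (e.g., outputs of three different SLU models, embedded in
    a continuous space) to the conversation-level RNN.
    """
    c = np.zeros(H)
    for utt, slu_vecs in zip(history, slu_vecs_per_utt):
        x = np.concatenate([encode_utterance(utt)] + list(slu_vecs))
        c = rnn_step(x, c, *conv_rnn)
    h = c  # conversation context initializes the decoder state
    for w in prefix_ids:
        h = rnn_step(emb[w], h, *dec_rnn)
    logits = W_out @ h
    p = np.exp(logits - logits.max())  # softmax over the vocabulary
    return p / p.sum()

# Toy conversation: two past utterances, each with three SLU vectors.
history = [[1, 2, 3], [4, 5]]
slu = [[rng.normal(size=S) for _ in range(3)] for _ in history]
p = next_word_dist(history, slu, prefix_ids=[6])
```

The key point the sketch illustrates is that conditioning happens in a continuous space: SLU outputs enter as vectors concatenated to each utterance encoding, so the conversation-level RNN can weigh semantic cues (e.g., dialogue acts or topics) alongside the lexical content when predicting the next utterance.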


DOI: 10.21437/Interspeech.2019-1534

Cite as: Masumura, R., Tanaka, T., Ando, A., Kamiyama, H., Oba, T., Kobashikawa, S., Aono, Y. (2019) Improving Conversation-Context Language Models with Multiple Spoken Language Understanding Models. Proc. Interspeech 2019, 834-838, DOI: 10.21437/Interspeech.2019-1534.


@inproceedings{Masumura2019,
  author={Ryo Masumura and Tomohiro Tanaka and Atsushi Ando and Hosana Kamiyama and Takanobu Oba and Satoshi Kobashikawa and Yushi Aono},
  title={{Improving Conversation-Context Language Models with Multiple Spoken Language Understanding Models}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={834--838},
  doi={10.21437/Interspeech.2019-1534},
  url={http://dx.doi.org/10.21437/Interspeech.2019-1534}
}