Turn-Taking Prediction Based on Detection of Transition Relevance Place

Kohei Hara, Koji Inoue, Katsuya Takanashi, Tatsuya Kawahara


We address turn-taking prediction, in which a spoken dialogue system predicts when to take the conversational floor. In natural conversations, many turn-taking decisions are arbitrary and subjective. In this study, we propose incorporating the concept of the transition relevance place (TRP) into turn-taking prediction. A TRP is defined as a point at which the current speaking turn can be completed and other participants are able to take the turn. We annotated TRPs on a human-robot dialogue corpus, ensuring the objectivity of this annotation among annotators. The proposed turn-taking prediction model adopts a two-step approach: it first detects a TRP and then predicts a turn-taking event only if a TRP is detected. Experimental evaluations demonstrate that the proposed model improves the accuracy of turn-taking prediction by incorporating TRP detection.
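The two-step decision described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the models, input features, or decision thresholds, so everything named here (`TwoStepTurnTaking`, `trp_detector`, `turn_predictor`, and the 0.5 thresholds) is a hypothetical placeholder and not the authors' implementation. The sketch only shows the gating logic: the turn-taking predictor is consulted only when a TRP has been detected.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical feature vector for one utterance end; the real features
# used in the paper are not given in the abstract.
Features = Sequence[float]


@dataclass
class TwoStepTurnTaking:
    """Gate a turn-taking predictor behind a TRP detector (illustrative only)."""

    trp_detector: Callable[[Features], float]    # assumed to return P(TRP)
    turn_predictor: Callable[[Features], float]  # assumed to return P(turn shift)
    trp_threshold: float = 0.5                   # assumed value
    turn_threshold: float = 0.5                  # assumed value

    def predict(self, features: Features) -> bool:
        # Step 1: if no TRP is detected, the system never takes the turn.
        if self.trp_detector(features) < self.trp_threshold:
            return False
        # Step 2: at a TRP, decide whether a turn-taking event occurs.
        return self.turn_predictor(features) >= self.turn_threshold


# Usage with stub scorers standing in for trained models.
model = TwoStepTurnTaking(
    trp_detector=lambda f: f[0],
    turn_predictor=lambda f: f[1],
)
print(model.predict([0.2, 0.9]))  # no TRP, so the predictor is not consulted
print(model.predict([0.8, 0.9]))  # TRP detected, predictor decides
```

Under these assumptions, the first step filters out points where a turn cannot be completed, so the second step only has to handle the decisions that remain at a TRP.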


DOI: 10.21437/Interspeech.2019-1537

Cite as: Hara, K., Inoue, K., Takanashi, K., Kawahara, T. (2019) Turn-Taking Prediction Based on Detection of Transition Relevance Place. Proc. Interspeech 2019, 4170-4174, DOI: 10.21437/Interspeech.2019-1537.


@inproceedings{Hara2019,
  author={Kohei Hara and Koji Inoue and Katsuya Takanashi and Tatsuya Kawahara},
  title={{Turn-Taking Prediction Based on Detection of Transition Relevance Place}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={4170--4174},
  doi={10.21437/Interspeech.2019-1537},
  url={http://dx.doi.org/10.21437/Interspeech.2019-1537}
}