11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Automatic Turn Segmentation in Spoken Conversations

Alexei V. Ivanov, Giuseppe Riccardi

Università di Trento, Italy

In this paper we study the problem of finding spoken turn boundaries in human-to-human telephone conversations. This task is essential to enable optimal operating conditions for automatic speech recognition of dialogs. The problem formulation differs from conventional voice activity detection and dialog diarization. We explore the applicability of various algorithms to this task and find that a hidden Markov model combining the results of modulation spectrum analysis with the Kullback-Leibler divergence of adjacent signal portions produces the best predictions. The algorithms were evaluated on realistic conversational data taken from the Switchboard corpus.
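As an illustration of the "Kullback-Leibler divergence of adjacent signal portions" idea, the sketch below scores each frame index by the symmetric KL divergence between a window of frames before it and a window after it. This is not the authors' implementation: the abstract does not specify the features or the distribution model, so the sketch assumes a single scalar feature per frame (for example a log-energy value) and fits a univariate Gaussian to each window. The names gaussian_kl, adjacent_kl, best_boundary, win and floor are hypothetical.

```python
import math
from statistics import fmean, pvariance


def gaussian_kl(mu_p, var_p, mu_q, var_q):
    """KL(p || q) for two univariate Gaussians given means and variances."""
    return 0.5 * (math.log(var_q / var_p)
                  + (var_p + (mu_p - mu_q) ** 2) / var_q
                  - 1.0)


def adjacent_kl(frames, win, floor=1e-6):
    """Symmetric KL between the `win` frames before and after each index.

    Assumption: `frames` is a list of scalar per-frame features.
    Variances are floored to avoid division by zero on flat windows.
    """
    scores = []
    for t in range(win, len(frames) - win + 1):
        left, right = frames[t - win:t], frames[t:t + win]
        mu_l, mu_r = fmean(left), fmean(right)
        var_l = max(pvariance(left, mu_l), floor)
        var_r = max(pvariance(right, mu_r), floor)
        score = (gaussian_kl(mu_l, var_l, mu_r, var_r)
                 + gaussian_kl(mu_r, var_r, mu_l, var_l))
        scores.append((t, score))
    return scores


def best_boundary(frames, win):
    """Index whose adjacent windows differ most under the symmetric KL score."""
    return max(adjacent_kl(frames, win), key=lambda s: s[1])[0]


if __name__ == "__main__":
    # Synthetic feature track with an abrupt change in level at index 10.
    frames = [0.0, 0.1] * 5 + [5.0, 5.1] * 5
    print(best_boundary(frames, win=4))
```

In the approach the abstract describes, a score of this kind would be one input to a hidden Markov model alongside modulation spectrum features, rather than being thresholded on its own.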


Bibliographic reference. Ivanov, Alexei V. / Riccardi, Giuseppe (2010): "Automatic turn segmentation in spoken conversations", in INTERSPEECH-2010, 3130-3133.