16th Annual Conference of the International Speech Communication Association

Dresden, Germany
September 6-10, 2015

Detecting Repetitions in Spoken Dialogue Systems Using Phonetic Distances

José Lopes (1), Giampiero Salvi (1), Gabriel Skantze (1), Alberto Abad (2), Joakim Gustafson (1), Fernando Batista (2), Raveesh Meena (1), Isabel Trancoso (2)

(1) KTH, Sweden
(2) INESC-ID Lisboa, Portugal

Repetitions in Spoken Dialogue Systems can be a symptom of problematic communication. Such repetitions are often caused by speech recognition errors, which in turn make it harder to detect repetitions from the output of the speech recognizer. In this paper, we combine the alignment score obtained using phonetic distances with dialogue-related features to improve repetition detection. To evaluate the proposed method, we compare several alignment techniques, ranging from edit distance to a DTW-based distance previously used in spoken term detection tasks. We also compare two methods for computing the phonetic distance: the first uses the phoneme sequence, and the second uses the distance between phone posterior vectors. Two datasets were used in this evaluation: a bus schedule information system (in English) and a call routing system (in Swedish). The results show that approaches using phoneme distances outperform approaches using Levenshtein distances between ASR outputs for repetition detection.
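As an illustration of the two families of alignment scores mentioned above, the following is a minimal sketch and not the authors' implementation. It assumes unit edit costs for the phoneme-sequence comparison, and for the posterior-based comparison it assumes a frame distance of minus the log of the inner product, a common choice in spoken term detection that the abstract does not specify. All function names, the length normalizations, and the toy inputs are hypothetical.

```python
import math


def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences (unit costs assumed)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (x != y)))   # substitution or match
        prev = curr
    return prev[-1]


def normalized_similarity(a, b):
    """Map the edit distance to [0, 1]; 1.0 means identical sequences."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def frame_distance(p, q, floor=1e-10):
    """Assumed local distance between two posterior vectors: -log(p . q)."""
    dot = sum(pi * qi for pi, qi in zip(p, q))
    return -math.log(max(dot, floor))


def dtw_distance(xs, ys):
    """DTW alignment cost between two posterior sequences, divided by the
    summed sequence lengths (the normalization is an assumption)."""
    n, m = len(xs), len(ys)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(xs[i - 1], ys[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # advance in xs only
                                 cost[i][j - 1],      # advance in ys only
                                 cost[i - 1][j - 1])  # advance in both
    return cost[n][m] / (n + m)


# Toy phoneme sequences for two user turns (hypothetical data).
turn_1 = ["f", "ao", "r", "b", "z"]
turn_2 = ["f", "ao", "r", "b", "s"]
phoneme_score = normalized_similarity(turn_1, turn_2)

# Toy posterior sequences over a two-phone inventory (hypothetical data).
post_1 = [[1.0, 0.0], [0.0, 1.0]]
post_2 = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
posterior_score = dtw_distance(post_1, post_2)
```

In a setup like the one described, either score could then be passed, together with dialogue-related features, to a classifier that decides whether the second turn repeats the first. That classifier is outside the scope of this sketch.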

Bibliographic reference. Lopes, José / Salvi, Giampiero / Skantze, Gabriel / Abad, Alberto / Gustafson, Joakim / Batista, Fernando / Meena, Raveesh / Trancoso, Isabel (2015): "Detecting repetitions in spoken dialogue systems using phonetic distances", In INTERSPEECH-2015, 1805-1809.