EUROSPEECH 2003 - INTERSPEECH 2003
Recently, the performance of speech recognition has improved drastically, and products with interfaces based on speech recognition have been realized. However, when we communicate with computers through a speech interface, misrecognition is inevitable, and recovering from it is difficult because of the immaturity of the interface. Users try to recover from a misrecognition by repeating the same content, so detecting a user's repetition helps a system to detect its misunderstanding and to recover from the misrecognition. In this paper, we regard an utterance that includes a repetition as a correction, and propose a method to detect correction utterances in spontaneously spoken dialog using word spotting based on DTW (dynamic time warping) and an overlap measure over N-best hypotheses. The method achieved a recall of 92.7% and a precision of 89.1%. Moreover, we tried to improve recognition accuracy using the detection result. By choosing the vocabulary and grammar setup based on the detection, we improved recognition performance from 42.7% to 50.0% for correction utterances and from 70.5% to 77.9% for non-correction utterances.
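The two cues named in the abstract can be illustrated with a minimal sketch. The abstract does not give the acoustic features, the DTW path constraints, the exact definition of the overlap measure, or how the two cues are combined, so everything below is an assumption made for illustration: `subsequence_dtw`, `nbest_overlap`, `looks_like_correction`, the Euclidean frame distance, the word-set overlap ratio, and both thresholds are hypothetical and are not taken from the paper.

```python
import math


def subsequence_dtw(query, target):
    """Length-normalized cost of the best DTW match of `query` inside `target`.

    Both arguments are non-empty sequences of equal-dimension feature vectors.
    The match may start and end at any frame of `target`, which is what lets
    DTW act as a word spotter. Frame distance is Euclidean (an assumption).
    """
    n, m = len(query), len(target)
    # Row 0 is all zeros: the match may begin at any target frame for free.
    prev = [0.0] * (m + 1)
    for i in range(1, n + 1):
        cur = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            d = math.dist(query[i - 1], target[j - 1])
            # Standard symmetric step pattern: vertical, horizontal, diagonal.
            cur[j] = d + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    # The match may end at any target frame; normalize by query length.
    return min(prev[1:]) / n


def nbest_overlap(nbest_a, nbest_b):
    """Share of words common to two N-best lists (hypothetical measure).

    Each list holds hypothesis strings. The score is the number of shared
    distinct words divided by the size of the smaller vocabulary.
    """
    words_a = {w for hyp in nbest_a for w in hyp.split()}
    words_b = {w for hyp in nbest_b for w in hyp.split()}
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / min(len(words_a), len(words_b))


def looks_like_correction(dtw_cost, overlap, max_cost=1.0, min_overlap=0.5):
    """Flag a repetition if either cue fires (thresholds are placeholders)."""
    return dtw_cost <= max_cost or overlap >= min_overlap


# Toy 1-D "features": the query pattern reappears inside the target.
query = [(0.0,), (1.0,)]
target = [(5.0,), (0.0,), (1.0,), (5.0,)]
cost = subsequence_dtw(query, target)

previous_nbest = ["to tokyo station", "to kyoto station"]
current_nbest = ["no tokyo station please", "tokyo station"]
overlap = nbest_overlap(previous_nbest, current_nbest)

print(cost, overlap, looks_like_correction(cost, overlap))
```

In this sketch a low DTW cost means that some segment of the previous utterance recurs acoustically in the current one, while a high overlap means that the recognizer's competing hypotheses for the two utterances share words. A real system would use spectral features per frame and tune both thresholds on held-out dialog data.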
Bibliographic reference. Kitaoka, Norihide / Kakutani, Naoko / Nakagawa, Seiichi (2003): "Detection and recognition of correction utterance in spontaneously spoken dialog", In EUROSPEECH-2003, 625-628.