INTERSPEECH 2004 - ICSLP
An adaptive speech recognizer is a key component in the design of a robust spoken dialogue system. Our research focuses on the human tendency to align one's prosody with that of one's conversational partners. A spoken dialogue system might be able to exploit this tendency to implicitly influence people to adjust their speech at the prosodic level so that it better suits the system's recognition capabilities, which in turn would decrease recognition errors. Prosodic alignment in human-computer interaction has previously been studied as part of the problem of personality alignment in the context of animated conversational characters. The present study examines the human tendency toward prosodic alignment at a more micro level, and explores whether people's speech amplitude and pause length align with those of computer-generated voices within a dialogue exchange. We found that people exhibit spontaneous short-term alignment of speech prosody to slight prosodic changes in a computer's voice within a session, even without the help of animated conversational characters.
Bibliographic reference. Suzuki, Noriko / Katagiri, Yasuhiro (2004): "Alignment of human prosodic patterns for spoken dialogue systems", In INTERSPEECH-2004, 2989-2992.