8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Noise Robust Real World Spoken Dialogue System using GMM Based Rejection of Unintended Inputs

Akinobu Lee (1), Keisuke Nakamura (1), Ryuichi Nisimura (2), Hiroshi Saruwatari (1), Kiyohiro Shikano (1)

(1) Nara Institute of Science and Technology, Japan
(2) Wakayama University, Japan

To realize a spoken dialogue system that is robust in a real environment, GMM-based rejection of unintended inputs such as laughter, coughing, background speech and other noise is implemented and evaluated on actual utterances. All triggered inputs to a speech-oriented guidance system were collected during 125 days of field testing in a public space, and the occurrence of unintended inputs was investigated. Based on this analysis, GMM classifiers were trained for voice categories (adult speech and child speech) and non-voice categories (laughter, coughing and other noises). Rejection performance for unintended inputs was evaluated on actual, uncontrolled inputs: the 5-class GMM achieved an equal error rate (EER) of 3.32%, outperforming a simple 2-class (voice / non-voice) GMM. The rejection of background speech using GMM is also investigated.
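The abstract does not state the exact decision rule or the acoustic features, so the following is only a minimal sketch of how a 5-class rejection decision and an EER measurement could be set up. It assumes that each input has already been scored against five class GMMs, giving one average log-likelihood per class. The class labels, the margin-based rule, and the function names (`voice_margin`, `accept`, `equal_error_rate`) are all hypothetical illustrations and are not taken from the paper.

```python
# Hypothetical class labels mirroring the five categories named in the abstract.
VOICE = ("adult", "child")
NON_VOICE = ("laughter", "cough", "noise")


def voice_margin(scores):
    """Best voice-class score minus best non-voice-class score.

    `scores` maps each class label to the average log-likelihood of the
    input under that class's GMM (assumed to be computed elsewhere).
    """
    best_voice = max(scores[c] for c in VOICE)
    best_non_voice = max(scores[c] for c in NON_VOICE)
    return best_voice - best_non_voice


def accept(scores, threshold=0.0):
    """Accept the input as intended speech if the margin reaches the threshold."""
    return voice_margin(scores) >= threshold


def equal_error_rate(margins, is_voice):
    """Sweep the threshold over the observed margins.

    Returns (eer, threshold) at the point where the false rejection rate
    (voice inputs rejected) and the false acceptance rate (non-voice inputs
    accepted) are closest; the EER is reported as their average.
    """
    voiced = [m for m, v in zip(margins, is_voice) if v]
    unvoiced = [m for m, v in zip(margins, is_voice) if not v]
    best = None
    for t in sorted(set(margins)):
        frr = sum(m < t for m in voiced) / len(voiced)
        far = sum(m >= t for m in unvoiced) / len(unvoiced)
        gap = abs(frr - far)
        if best is None or gap < best[0]:
            best = (gap, (frr + far) / 2, t)
    return best[1], best[2]


if __name__ == "__main__":
    # Toy margins, invented purely for illustration (not data from the paper).
    margins = [2.0, 1.5, 0.2, -0.5, 0.4, -1.0, -2.0, -0.3]
    labels = [True, True, True, True, False, False, False, False]
    print(equal_error_rate(margins, labels))
```

Under this assumed rule, a 2-class (voice / non-voice) system corresponds to using a single model on each side of the margin, whereas the 5-class variant takes the best-scoring model within each group.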


Bibliographic reference.  Lee, Akinobu / Nakamura, Keisuke / Nisimura, Ryuichi / Saruwatari, Hiroshi / Shikano, Kiyohiro (2004): "Noise robust real world spoken dialogue system using GMM based rejection of unintended inputs", In INTERSPEECH-2004, 173-176.