INTERSPEECH 2011
12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Evaluation of Abnormal Sound Detection using Multi-Stage GMM in Various Environments

Akinori Ito (1), Akihito Aiba (1), Masashi Ito (2), Shozo Makino (3)

(1) Tohoku University, Japan
(2) Tohoku Institute of Technology, Japan
(3) Tohoku Bunka Gakuen University, Japan

We have developed a method for automatically detecting incidents by detecting abnormal sound events in audio signals recorded in real environments. The proposed method uses a multi-stage Gaussian mixture model (GMM), which learns rare sounds using multiple GMMs. In this work, we first investigated the relationship between the sound environment and detection performance. We found that performance deteriorates in noisy environments and depends largely on the signal-to-noise ratio (SNR) of the abnormal sounds. Next, we investigated methods for determining the hyperparameters of the multi-stage GMM: the intermediate thresholds, the number of mixtures in each GMM, and the detection threshold. The experimental results showed that combining percentile-based threshold determination with Bayesian information criterion (BIC)-based mixture determination was the most effective. However, with the automatically determined parameters, detection performance deteriorated by up to 20%.
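
The two hyperparameter rules named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the percentile values, the covariance type, or the candidate mixture counts, so the diagonal-covariance parameter count, the linear-interpolation percentile, and all function names below are assumptions made for illustration. The sketch takes per-frame log-likelihood scores and per-model total log-likelihoods as given and does not train any GMM.

```python
import math


def percentile_threshold(scores, q):
    """Return the q-th percentile (0-100) of scores, using linear interpolation.

    Assumption: a threshold is placed at a fixed percentile of the
    training log-likelihoods; the percentile value itself is not given
    in the abstract.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    if not 0 <= q <= 100:
        raise ValueError("q must be between 0 and 100")
    ordered = sorted(scores)
    pos = (len(ordered) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def gmm_param_count(n_components, n_dims):
    """Free parameters of a GMM, assuming diagonal covariances."""
    # Per component: n_dims means + n_dims variances.
    # Mixture weights sum to one, so only n_components - 1 are free.
    return n_components * 2 * n_dims + (n_components - 1)


def bic(total_log_likelihood, n_params, n_samples):
    """Bayesian information criterion; lower is better."""
    return n_params * math.log(n_samples) - 2.0 * total_log_likelihood


def select_mixtures_by_bic(total_log_likelihoods, n_dims, n_samples):
    """Pick the mixture count with the lowest BIC.

    total_log_likelihoods maps a candidate mixture count to the total
    training log-likelihood of a GMM already fitted with that count.
    """
    return min(
        total_log_likelihoods,
        key=lambda m: bic(
            total_log_likelihoods[m], gmm_param_count(m, n_dims), n_samples
        ),
    )


# Hypothetical numbers, chosen only to exercise the functions.
frame_scores = [-12.0, -3.0, -2.5, -2.0, -1.0]
threshold = percentile_threshold(frame_scores, 25)
low_likelihood_frames = [s for s in frame_scores if s < threshold]

candidates = {1: -5000.0, 2: -4900.0, 4: -4895.0}
best_m = select_mixtures_by_bic(candidates, n_dims=2, n_samples=1000)
```

In this sketch, frames scoring below the percentile threshold are treated as the "rare" data, and the BIC penalty keeps the mixture count from growing when the gain in log-likelihood is small.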


Bibliographic reference. Ito, Akinori / Aiba, Akihito / Ito, Masashi / Makino, Shozo (2011): "Evaluation of abnormal sound detection using multi-stage GMM in various environments", In INTERSPEECH-2011, 301-304.