11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Detecting Novel Objects in Acoustic Scenes Through Classifier Incongruence

Jörg-Hendrik Bach, Jörn Anemüller

Carl von Ossietzky Universität Oldenburg, Germany

In this study, a new generic framework for the detection and interpretation of disagreement (“incongruence”) between different classifiers [15] is applied to the problem of detecting novel acoustic objects in an office environment. The framework combines a general model, which detects generic acoustic objects standing out from a stationary background, with specific models tuned to particular sounds expected in the office. A novel object is detected as an incongruence between the models: the general model detects it as a generic object, but the specific models cannot identify it as any of the known office-related sources. The detectors are realized using amplitude modulation spectrogram and RASTA-PLP features with support vector machine classification. The data considered are speech and non-speech sounds embedded in real office background noise at signal-to-noise ratios (SNR) from +20 dB to -20 dB. Our approach yields a hit rate of approximately 90% for novel events at 20 dB SNR and 75% at 0 dB, and reaches chance level below -10 dB.
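The incongruence rule described in the abstract can be illustrated with a minimal sketch. It assumes each detector outputs a real-valued score (for example an SVM decision value) where a score above a threshold means "detected". The function name, the thresholds, and the example class names are hypothetical and are not taken from the paper.

```python
from typing import Dict


def is_novel(general_score: float,
             specific_scores: Dict[str, float],
             general_threshold: float = 0.0,
             specific_threshold: float = 0.0) -> bool:
    """Return True when the detectors are incongruent.

    Incongruence here means: the general detector reports an acoustic
    object, but none of the specific detectors recognizes it.
    Thresholds of 0.0 are an assumption (sign of an SVM decision value).
    """
    # General model: is there any object standing out from the background?
    object_present = general_score > general_threshold
    # Specific models: does any known office sound class claim the object?
    known_source = any(score > specific_threshold
                       for score in specific_scores.values())
    return object_present and not known_source


# Hypothetical scores for one analysis window.
print(is_novel(1.2, {"speech": -0.8, "keyboard": -0.3}))  # object, no known class
print(is_novel(1.2, {"speech": 0.9, "keyboard": -0.3}))   # object, known class
print(is_novel(-0.5, {"speech": -0.8, "keyboard": -0.3})) # background only
```

In this sketch only the first call flags a novel event. A known sound and pure background both count as congruent, because in those cases the general and specific models agree.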


Bibliographic reference.  Bach, Jörg-Hendrik / Anemüller, Jörn (2010): "Detecting novel objects in acoustic scenes through classifier incongruence", In INTERSPEECH-2010, 2206-2209.