8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Speech-Nonspeech Discrimination Using the Information Bottleneck Method and Spectro-Temporal Modulation Index

Maria Markaki, Michael Wohlmayr, Yannis Stylianou

University of Crete, Greece

In this work, we adopt an information-theoretic approach, the Information Bottleneck method, to extract the spectro-temporal modulations relevant to speech/non-speech discrimination, where non-speech events include music, noise and animal vocalizations. For each class we build a compact representation (a "cluster prototype") consisting of the features that are maximally informative with respect to the classification task. We assess the similarity of a sound to each cluster prototype using the spectro-temporal modulation index (STMI), adapted to handle the contribution of different frequency bands. A simple threshold check then discriminates speech from non-speech events. Experiments show that the proposed method has low complexity and, in low-SNR conditions, high discrimination accuracy compared to recently proposed methods for the same task.
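
The final decision step described in the abstract (similarity to each class prototype, followed by a threshold check) can be illustrated with a minimal sketch. This is not the authors' implementation. The similarity measure used here, one minus the squared distance to the prototype normalized by the prototype energy, is an assumption chosen for illustration and omits the paper's adaptation to different frequency bands. The function names, the four-element feature vectors and the default threshold are likewise hypothetical.

```python
def similarity_index(features, prototype):
    """Illustrative similarity: 1 - ||prototype - features||^2 / ||prototype||^2.

    Assumed form for this sketch only; the paper's adapted STMI may differ.
    """
    if len(features) != len(prototype):
        raise ValueError("features and prototype must have the same length")
    distance = sum((p - f) ** 2 for p, f in zip(prototype, features))
    energy = sum(p ** 2 for p in prototype)
    if energy == 0:
        raise ValueError("prototype must have non-zero energy")
    return 1.0 - distance / energy


def is_speech(features, speech_proto, nonspeech_proto, threshold=0.0):
    """Threshold check on the difference of the two similarity scores."""
    score = (similarity_index(features, speech_proto)
             - similarity_index(features, nonspeech_proto))
    return score > threshold


# Hypothetical prototypes over four modulation features.
speech_proto = [1.0, 0.0, 1.0, 0.0]
nonspeech_proto = [0.0, 1.0, 0.0, 1.0]

speech_like = [0.9, 0.1, 0.8, 0.0]
noise_like = [0.1, 0.9, 0.0, 1.0]
```

Under these assumptions, `is_speech(speech_like, speech_proto, nonspeech_proto)` is true and the same call with `noise_like` is false. In practice the threshold would be tuned on held-out data rather than fixed at zero.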


Bibliographic reference. Markaki, Maria / Wohlmayr, Michael / Stylianou, Yannis (2007): "Speech-nonspeech discrimination using the information bottleneck method and spectro-temporal modulation index", in INTERSPEECH-2007, 2913-2916.