Machine Listening in Multisource Environments (CHiME) 2011

Florence, Italy
September 1, 2011

Sound Event Detection in Multisource Environments Using Source Separation

Toni Heittola (1), Annamaria Mesaros (1), Tuomas Virtanen (1), Antti Eronen (2)

(1) Department of Signal Processing, Tampere University of Technology, Tampere, Finland
(2) Nokia Research Center, Tampere, Finland

This paper proposes a sound event detection system for natural multisource environments, using a sound source separation front-end. The recognizer aims to detect sound events from various everyday contexts. The audio is preprocessed using non-negative matrix factorization and separated into four individual signals. Each sound event class is represented by a hidden Markov model trained using mel-frequency cepstral coefficients extracted from the audio. Features are extracted from each separated signal individually, and sound events are then segmented and classified using the Viterbi algorithm. The separation allows detection of a maximum of four overlapping events. The proposed system shows a significant increase in event detection accuracy compared to a system that can output only a single sequence of events.
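
The separation front-end described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a magnitude spectrogram as input, one NMF component per output signal, multiplicative updates for a Euclidean cost, and soft masking for reconstruction. The function name nmf_separate and all parameter values are hypothetical.

```python
import numpy as np

def nmf_separate(V, n_streams=4, n_iter=200, eps=1e-9, seed=0):
    """Split a non-negative spectrogram V (freq x time) into n_streams parts.

    Assumption: one NMF component per stream, Euclidean-cost
    multiplicative updates. The paper may use a different cost
    function or component grouping.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    # Random positive initialization of spectral bases W and activations H.
    W = rng.random((n_freq, n_streams)) + eps
    H = rng.random((n_streams, n_frames)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    # Each component's contribution to the model spectrogram.
    components = [np.outer(W[:, k], H[k]) for k in range(n_streams)]
    model = np.maximum(sum(components), eps)
    # Soft masks distribute the observed energy among the streams.
    streams = [V * c / model for c in components]
    return W, H, streams

# Toy usage with a synthetic "spectrogram" (hypothetical data).
V = np.random.default_rng(1).random((20, 30)) + 0.1
W, H, streams = nmf_separate(V)
print(len(streams), streams[0].shape)
```

In a pipeline like the one the abstract outlines, each of the four stream spectrograms would then go through feature extraction and decoding separately, which is what allows up to four simultaneous events to be reported.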

Index Terms. sound event detection, sound source separation, non-negative matrix factorization


Bibliographic reference. Heittola, Toni / Mesaros, Annamaria / Virtanen, Tuomas / Eronen, Antti (2011): "Sound event detection in multisource environments using source separation", In CHiME-2011, 36-40.