Machine Listening in Multisource Environments (CHiME) 2011
This paper proposes a sound event detection system for natural multisource environments that uses a sound source separation front-end. The recognizer aims to detect sound events from a variety of everyday contexts. The audio is preprocessed using non-negative matrix factorization and separated into four individual signals. Each sound event class is represented by a hidden Markov model trained on mel-frequency cepstral coefficients extracted from the audio. Features are extracted from each separated signal individually, and sound events are then segmented and classified using the Viterbi algorithm. The separation allows up to four overlapping events to be detected. The proposed system shows a significant increase in event detection accuracy compared to a system that can output only a single sequence of events.
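As an illustration of the separation front-end described above, the following is a minimal sketch of factorizing a magnitude spectrogram into four components with non-negative matrix factorization and splitting it into four streams. The abstract does not specify the cost function, update rule, or reconstruction method, so the generalized Kullback-Leibler multiplicative updates, the soft-mask reconstruction, the random toy input, and the names `nmf_kl` and `separate_streams` are all assumptions made for this example, not the authors' implementation.

```python
import numpy as np


def nmf_kl(V, n_components=4, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V (freq x frames) as W @ H.

    Uses multiplicative updates for the generalized KL divergence
    (an assumed choice; the abstract does not name the cost function).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, n_components)) + eps   # spectral basis vectors
    H = rng.random((n_components, n_frames)) + eps  # time-varying gains
    ones = np.ones_like(V)
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.T @ ones + eps)
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (ones @ H.T + eps)
    return W, H


def separate_streams(V, W, H, eps=1e-9):
    """Split V into one magnitude spectrogram per component via soft masks."""
    WH = W @ H + eps
    return [V * (np.outer(W[:, k], H[k]) / WH) for k in range(W.shape[1])]


# Toy input standing in for a magnitude spectrogram (hypothetical data).
rng = np.random.default_rng(1)
V = rng.random((64, 100)) + 0.1

W, H = nmf_kl(V, n_components=4)
streams = separate_streams(V, W, H)
```

In a pipeline like the one the abstract outlines, each of the four streams would then go through feature extraction and decoding separately, which is what permits up to four simultaneous event labels.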
Index Terms. sound event detection, sound source separation, non-negative matrix factorization
Bibliographic reference. Heittola, Toni / Mesaros, Annamaria / Virtanen, Tuomas / Eronen, Antti (2011): "Sound event detection in multisource environments using source separation", In CHiME-2011, 36-40.