9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Training Audio Events Detectors with a Sound Effects Corpus

Isabel Trancoso (1), Jose Portelo (1), Miguel Bugalho (1), João Neto (1), Antonio Serralheiro (2)

(1) INESC-ID/IST, Portugal; (2) INESC-ID/Academia Militar, Portugal

This paper describes the audio event detection work carried out within the VIDIVIDEO European project. Our first experiments concerned the detection of non-voice sounds such as birds, machines, traffic, water and steps. Because no corpus labelled in terms of audio events was available, we used a relatively small sound effects corpus for training. Initial experiments with one-against-all SVM classifiers for these 5 classes showed that training on this type of data is feasible, thus avoiding the extremely time-consuming task of manually labelling a very large number of audio events. Preliminary integration experiments are quite promising.
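
The one-against-all scheme mentioned in the abstract can be illustrated with a minimal sketch: one binary detector is trained per class (that class against all the others), and an input is assigned to the class whose detector scores highest. Only the five class names come from the abstract. The linear SVM trained by subgradient descent, the hyperparameters, and the synthetic 5-dimensional feature vectors are assumptions made for illustration; the abstract does not state which features, kernel, or SVM implementation the authors used.

```python
import random

# Class names taken from the abstract; everything else here is illustrative.
CLASSES = ["birds", "machines", "traffic", "water", "steps"]


def train_binary_svm(X, y, epochs=100, lr=0.05, lam=0.01, seed=0):
    """Linear SVM via subgradient descent on the regularised hinge loss.

    X is a list of feature vectors, y a list of labels in {-1, +1}.
    Returns the weight vector and the bias.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            # L2 shrinkage on the weights (the bias is left unregularised).
            w = [wj * (1.0 - lr * lam) for wj in w]
            if y[i] * score < 1.0:  # sample violates the margin
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


def train_one_against_all(X, labels):
    """Train one binary detector per class: that class (+1) vs. the rest (-1)."""
    models = {}
    for name in CLASSES:
        y = [1 if lab == name else -1 for lab in labels]
        models[name] = train_binary_svm(X, y)
    return models


def detect(models, x):
    """Return the class whose detector gives the highest score for x."""
    def score(name):
        w, b = models[name]
        return sum(wj * xj for wj, xj in zip(w, x)) + b
    return max(CLASSES, key=score)


def make_toy_corpus(n_per_class=20, seed=1):
    """Synthetic stand-in for a sound effects corpus (NOT real audio features)."""
    rng = random.Random(seed)
    X, labels = [], []
    for k, name in enumerate(CLASSES):
        for _ in range(n_per_class):
            x = [rng.uniform(-0.1, 0.1) for _ in CLASSES]
            x[k] += 1.0  # each class clusters around its own axis
            X.append(x)
            labels.append(name)
    return X, labels


X_train, y_train = make_toy_corpus()
detectors = train_one_against_all(X_train, y_train)
```

In a real system the toy vectors would be replaced by acoustic features extracted from the sound effect clips, and each per-class score could be thresholded separately so that several events, or none, are reported for the same segment.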

Bibliographic reference. Trancoso, Isabel / Portelo, Jose / Bugalho, Miguel / Neto, João / Serralheiro, Antonio (2008): "Training audio events detectors with a sound effects corpus", in INTERSPEECH-2008, 2546-2549.