This paper describes the audio event detection work carried out within the VIDIVIDEO European project. Our first experiments concerned the detection of non-voice sounds, such as birds, machines, traffic, water and steps. Since no corpus labelled in terms of audio events was available, we used a relatively small sound effects corpus for training. Our initial experiments with one-against-all SVM classifiers for these 5 classes showed the feasibility of using this type of data for training, thus avoiding the extremely time-consuming task of manually labelling a very large number of audio events. Preliminary integration experiments are quite promising.
Bibliographic reference. Trancoso, Isabel / Portelo, Jose / Bugalho, Miguel / Neto, João / Serralheiro, Antonio (2008): "Training audio events detectors with a sound effects corpus", In INTERSPEECH-2008, 2546-2549.