13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Overlapping Sound Event Recognition using Local Spectrogram Features with the Generalised Hough Transform

Jonathan Dennis (1,2), Huy Dat Tran (1), Eng Siong Chng (2)

(1) Institute for Infocomm Research, A*STAR, Singapore
(2) School of Computer Engineering, Nanyang Technological University, Singapore

We present a novel approach to the recognition of overlapping sound events based on the Generalised Hough Transform (GHT), a technique commonly used for object recognition in image processing. Unlike our previous work on image-based sound event classification, where we focussed on global image features, here we extract local features from detected interest-points in the spectrogram. Each local feature forms a robust representation of its local region, and when the information from all interest-points in the spectrogram is combined using the GHT, we can form hypotheses for the location of one or more overlapping sound events in the image. Our experiments show promising results, and demonstrate the ability of our approach to recognise overlapping sounds.

Index Terms: overlapping, sound events, recognition
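
The voting idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that each interest-point has already been reduced to a (frame, codeword) pair, that each sound class has its own model mapping codewords to time offsets from a reference point (here the event onset), and that hypotheses are read off as peaks in a one-dimensional accumulator over time. The function names build_model and vote, the codeword labels, and the toy data are all hypothetical.

```python
import numpy as np
from collections import defaultdict


def build_model(points, ref_time):
    """Map each codeword to the offsets from its interest-points to ref_time.

    points: list of (frame, codeword) pairs from one training example
            (hypothetical format, assumed for this sketch).
    """
    model = defaultdict(list)
    for frame, codeword in points:
        model[codeword].append(ref_time - frame)
    return model


def vote(points, model, n_frames):
    """Accumulate GHT-style votes for the reference time of one sound class."""
    acc = np.zeros(n_frames)
    for frame, codeword in points:
        # Interest-points whose codeword is unknown to this model cast no vote.
        for offset in model.get(codeword, []):
            ref = frame + offset
            if 0 <= ref < n_frames:
                acc[ref] += 1
    return acc


# Toy training examples, each with its onset at frame 0.
model_a = build_model([(0, "a1"), (2, "a2"), (5, "a3")], ref_time=0)
model_b = build_model([(0, "b1"), (1, "b2"), (3, "b3")], ref_time=0)

# Toy mixture: class A starts at frame 10 and class B at frame 12,
# so their interest-points overlap in time.
mixture = [(10, "a1"), (12, "a2"), (15, "a3"),
           (12, "b1"), (13, "b2"), (15, "b3")]

acc_a = vote(mixture, model_a, n_frames=20)
acc_b = vote(mixture, model_b, n_frames=20)

# The accumulator peak for each class is its onset hypothesis.
onset_a = int(np.argmax(acc_a))
onset_b = int(np.argmax(acc_b))
print(onset_a, onset_b)
```

Because each interest-point votes independently, points belonging to one sound do not disturb the accumulator of another class, which is why this kind of voting is a natural fit for overlapping events. A real system would also need interest-point detection, local feature extraction and a learned codebook, none of which are sketched here.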

