10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Acoustic Event Detection for Spotting “Hot Spots” in Podcasts

Kouhei Sumi (1), Tatsuya Kawahara (1), Jun Ogata (2), Masataka Goto (2)

(1) Kyoto University, Japan
(2) AIST, Japan

This paper presents a method for detecting acoustic events that can be used to find “hot spots” in podcast programs. We focus on meaningful non-verbal audible reactions that suggest hot spots, such as laughter and reactive tokens. To detect such short events and segment the corresponding utterances, we need accurate audio segmentation and classification that can cope with various recording environments and background music. We therefore propose a method that automatically estimates and switches the penalty weights for BIC-based segmentation according to the background environment. Experimental results show that the proposed method significantly improves detection accuracy compared with using a constant penalty weight.
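
For readers unfamiliar with BIC-based segmentation, the following is a minimal sketch of the standard delta-BIC change-point test with an explicit penalty weight. It is not the authors' implementation: the abstract does not specify the acoustic features, the window handling, or how the penalty weights are estimated and switched, so the full-covariance Gaussian model, the function names (delta_bic, best_change_point), and the margin parameter are all assumptions made for illustration.

```python
import numpy as np


def _logdet_cov(frames: np.ndarray) -> float:
    """Log-determinant of the maximum-likelihood full covariance of frames."""
    cov = np.atleast_2d(np.cov(frames, rowvar=False, bias=True))
    cov = cov + 1e-6 * np.eye(frames.shape[1])  # small ridge for numerical stability
    _, logdet = np.linalg.slogdet(cov)
    return logdet


def delta_bic(frames: np.ndarray, split: int, penalty_weight: float = 1.0) -> float:
    """Delta-BIC for splitting frames (n_frames x dim) at index `split`.

    A positive value favours modelling the two halves with separate
    Gaussians, i.e. it suggests a segment boundary at `split`.
    """
    n, dim = frames.shape
    # Extra free parameters of the two-Gaussian model: mean plus full covariance.
    n_params = dim + dim * (dim + 1) / 2
    penalty = 0.5 * penalty_weight * n_params * np.log(n)
    gain = 0.5 * (
        n * _logdet_cov(frames)
        - split * _logdet_cov(frames[:split])
        - (n - split) * _logdet_cov(frames[split:])
    )
    return gain - penalty


def best_change_point(frames: np.ndarray, penalty_weight: float = 1.0, margin: int = 10):
    """Return the split index with the largest positive delta-BIC, or None.

    `margin` (an assumed parameter) keeps enough frames on each side
    for the covariance estimates to be usable.
    """
    candidates = range(margin, frames.shape[0] - margin)
    scores = [delta_bic(frames, i, penalty_weight) for i in candidates]
    best = int(np.argmax(scores))
    return candidates[best] if scores[best] > 0 else None
```

A larger penalty_weight suppresses boundaries and a smaller one admits more of them, which is why a single constant value can be a poor fit across clean speech, noisy recordings, and background music. Choosing the weight per environment is the contribution described in the abstract and is not reproduced in this sketch.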

Bibliographic reference. Sumi, Kouhei / Kawahara, Tatsuya / Ogata, Jun / Goto, Masataka (2009): "Acoustic event detection for spotting “hot spots” in podcasts", in INTERSPEECH-2009, 1143-1146.