This paper presents a method for detecting acoustic events that can be used to find “hot spots” in podcast programs. We focus on meaningful non-verbal audible reactions that suggest hot spots, such as laughter and reactive tokens. To detect such short events and segment the counterpart utterances, we need accurate audio segmentation and classification that can cope with various recording environments and background music. We therefore propose a method that automatically estimates and switches the penalty weight for BIC-based segmentation depending on the background environment. Experimental results show a significant improvement in detection accuracy with the proposed method compared with using a constant penalty weight.
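To illustrate where a penalty weight enters BIC-based segmentation, the following is a minimal sketch and not the authors' implementation. It assumes the common full-covariance Gaussian form of the ΔBIC criterion, which the abstract does not spell out. The function names `delta_bic` and `find_change_point`, the `margin` parameter, and the default weight of 1.0 are hypothetical choices made for this example.

```python
import numpy as np


def delta_bic(frames, split, penalty_weight=1.0):
    """Delta-BIC for splitting `frames` (n x d feature matrix) at row `split`.

    Assumes one full-covariance Gaussian per segment (an assumption of this
    sketch). A positive value favours a segment boundary at `split`.
    """
    n, d = frames.shape

    def logdet(x):
        # Maximum-likelihood covariance with a small ridge for stability.
        cov = np.atleast_2d(np.cov(x, rowvar=False, bias=True)) + 1e-6 * np.eye(d)
        return np.linalg.slogdet(cov)[1]

    # Likelihood gain of modelling the two halves separately.
    gain = 0.5 * (
        n * logdet(frames)
        - split * logdet(frames[:split])
        - (n - split) * logdet(frames[split:])
    )
    # Model-complexity penalty, scaled by the tunable penalty weight.
    n_params = d + d * (d + 1) / 2
    penalty = 0.5 * penalty_weight * n_params * np.log(n)
    return gain - penalty


def find_change_point(frames, penalty_weight=1.0, margin=10):
    """Return (index, score) of the best boundary, or (None, score) if none."""
    n = len(frames)
    if n <= 2 * margin:
        raise ValueError("window too short for the chosen margin")
    candidates = range(margin, n - margin)
    scores = [delta_bic(frames, i, penalty_weight) for i in candidates]
    best = int(np.argmax(scores))
    if scores[best] > 0:
        return candidates[best], scores[best]
    return None, scores[best]
```

A larger `penalty_weight` suppresses boundaries and a smaller one produces more of them, so a single constant value may not suit every recording condition. In the approach described above, this weight would be estimated and switched according to the background environment; how that estimate is obtained is not specified in the abstract and is left out of this sketch.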
Bibliographic reference. Sumi, Kouhei / Kawahara, Tatsuya / Ogata, Jun / Goto, Masataka (2009): "Acoustic event detection for spotting “hot spots” in podcasts", In INTERSPEECH-2009, 1143-1146.