ISCA Archive Interspeech 2008

Speech/laughter classification in meeting audio

Swe Zin Kalayar Khine, Tin Lay Nwe, Haizhou Li

In this paper, harmonicity information is incorporated into acoustic features to detect laughter segments and speech segments. We implement our system using a Hidden Markov Model (HMM) classifier trained on features derived from Pitch and Harmonic Frequency Scale (PHFS) based subband filters. The harmonicity of the signal can be determined from the variation of the pitch and harmonics. Cascaded subband filters are spread over the pitch and harmonic frequency scale to describe the harmonicity information. The pitch bandwidth of the first layer spans 80 Hz to 300 Hz, and the entire band spans 80 Hz to 8 kHz. The experiments are conducted on the ICSI meeting corpus (BMR and BED). We achieve an average error rate of 0.84% for the 'BMR' meeting and 3.64% for the 'BED' meeting in segment-level speech and laughter detection. The results show that the proposed PHFS-based feature is robust and effective.

doi: 10.21437/Interspeech.2008-243

Cite as: Khine, S.Z.K., Nwe, T.L., Li, H. (2008) Speech/laughter classification in meeting audio. Proc. Interspeech 2008, 793-796, doi: 10.21437/Interspeech.2008-243

  author={Swe Zin Kalayar Khine and Tin Lay Nwe and Haizhou Li},
  title={{Speech/laughter classification in meeting audio}},
  booktitle={Proc. Interspeech 2008},