ISCA Archive Interspeech 2006
Soundbite detection in broadcast news domain

Sameer Maskey, Julia Hirschberg

In this paper, we present the results of a study designed to identify soundbites in Broadcast News. We describe a Conditional Random Field (CRF)-based model for detecting these included speech segments, which are uttered by individuals who are interviewed or who are the subject of a news story. Our goal is to identify direct quotations in spoken corpora that can be attributed to particular individuals, and to associate these soundbites with their speakers. We frame soundbite detection as a binary classification problem in which each turn is categorized as either a soundbite or not a soundbite. We use lexical, acoustic/prosodic, and structural features at the turn level to train a CRF. In a 10-fold cross-validation experiment, we obtained an accuracy of 67.4% and an F-measure of 0.566, which are 20.9% and 38.6% higher, respectively, than a chance baseline.
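
As an illustration of the evaluation setup described in the abstract, the following is a minimal sketch of how accuracy and F-measure could be computed over binary turn labels, together with a simple 10-fold split. It is not the authors' code. The function names `accuracy_and_f1` and `k_fold_indices` are hypothetical, the F-measure is assumed to be the balanced F1 score on the soundbite class, and the folds are assumed to be contiguous blocks of turns; the abstract does not specify either choice.

```python
from typing import List, Tuple


def accuracy_and_f1(gold: List[int], pred: List[int]) -> Tuple[float, float]:
    """Return (accuracy, F1) for binary turn labels, where 1 = soundbite.

    Assumption: F-measure is taken to be balanced F1 on the positive class.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    correct = sum(1 for g, p in zip(gold, pred) if g == p)

    accuracy = correct / len(gold) if gold else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1


def k_fold_indices(n_items: int, k: int = 10) -> List[List[int]]:
    """Split range(n_items) into k contiguous folds of near-equal size.

    Assumption: contiguous folds, so neighbouring turns stay together.
    """
    folds = []
    start = 0
    for i in range(k):
        # The first (n_items % k) folds receive one extra item.
        size = n_items // k + (1 if i < n_items % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


if __name__ == "__main__":
    # Toy labels for five turns; these are illustrative, not data from the paper.
    gold = [1, 0, 1, 1, 0]
    pred = [1, 0, 0, 1, 1]
    acc, f1 = accuracy_and_f1(gold, pred)
    print(f"accuracy={acc:.3f} f1={f1:.3f}")
    print([len(fold) for fold in k_fold_indices(23, k=10)])
```

In a cross-validation run, each fold would serve once as the held-out test set while a model is trained on the remaining nine, and the two metrics would then be averaged or pooled across folds.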

doi: 10.21437/Interspeech.2006-433

Cite as: Maskey, S., Hirschberg, J. (2006) Soundbite detection in broadcast news domain. Proc. Interspeech 2006, paper 1690-Wed1WeS.5, doi: 10.21437/Interspeech.2006-433

  author={Sameer Maskey and Julia Hirschberg},
  title={{Soundbite detection in broadcast news domain}},
  booktitle={Proc. Interspeech 2006},
  pages={paper 1690-Wed1WeS.5},