ISCA Archive Interspeech 2007

A comparative study on speech summarization of broadcast news and lecture speech

Jian Zhang, Ho Yin Chan, Pascale Fung, Lu Cao

We carry out a comprehensive study of acoustic/prosodic, linguistic, and structural features for speech summarization, contrasting two genres of speech: Broadcast News and Lecture Speech. We find that acoustic and structural features are more important for Broadcast News summarization, owing to the speaking styles of anchors and reporters and to the typical flow of a news story. Because lexical features contribute relatively little, Broadcast News summarization does not depend heavily on ASR accuracy. We use an SVM-based summarizer to select the best features for extractive summarization and obtain state-of-the-art performance: a ROUGE-L F-measure of 0.64 for Mandarin Broadcast News and 0.65 for Mandarin Lecture Speech. For Lecture Speech summarization, where lexical features are more important, we make the surprising discovery that summarization performance remains very high (0.63 ROUGE-L F-measure) even when ASR accuracy is low (21% CER).
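
For readers unfamiliar with the metric quoted above, the following is a minimal sketch of a sentence-level ROUGE-L F-measure, computed from the longest common subsequence (LCS) between a candidate summary and a reference summary. It is an illustration only: the function names, the whitespace tokenization, and the `beta` weighting are assumptions made here, and the abstract does not state the exact ROUGE configuration or tokenization the authors used (for Mandarin, character-level tokens would be one plausible choice).

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f(candidate, reference, beta=1.0):
    """Sentence-level ROUGE-L F-measure (hypothetical helper, not the authors' code).

    candidate, reference: lists of tokens.
    beta: relative weight of recall versus precision (assumed 1.0 here).
    """
    lcs = lcs_length(candidate, reference)
    if lcs == 0:
        return 0.0
    precision = lcs / len(candidate)
    recall = lcs / len(reference)
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


# Example with English tokens; the LCS here is "the cat on the mat" (5 tokens).
candidate = "the cat sat on the mat".split()
reference = "the cat is on the mat".split()
score = rouge_l_f(candidate, reference)
```

With `beta=1.0` the score is the harmonic mean of LCS-based precision and recall, so a value such as 0.64 indicates substantial in-order overlap between the extracted summary and the reference.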


doi: 10.21437/Interspeech.2007-717

Cite as: Zhang, J., Chan, H.Y., Fung, P., Cao, L. (2007) A comparative study on speech summarization of broadcast news and lecture speech. Proc. Interspeech 2007, 2781-2784, doi: 10.21437/Interspeech.2007-717

@inproceedings{zhang07e_interspeech,
  author={Jian Zhang and Ho Yin Chan and Pascale Fung and Lu Cao},
  title={{A comparative study on speech summarization of broadcast news and lecture speech}},
  year=2007,
  booktitle={Proc. Interspeech 2007},
  pages={2781--2784},
  doi={10.21437/Interspeech.2007-717}
}