ISCA Archive ISCSLP 2008
ISCA Archive ISCSLP 2008

A Two-stage Multi-feature Integration Approach to Unsupervised Speaker Change Detection in Real-time News Broadcasting

Lei Xie, Guang-Sen Wang

This paper presents a two-stage multi-feature integration approach for unsupervised speaker change detection in real-time news broadcasting. We integrate MFCC and LSP features (i.e. a perceptual feature plus a articulatory feature) in the metric-based potential speaker change detection stage to collect speaker boundary candidates as many as possible. We adopt a weighted Bayesian information criterion (BIC) to integrate boundary decisions from MFCC and LSP features in the speaker boundary confirmation stage. This multi-feature integration strategy makes use of the complementarity between perceptual features and articulatory features to achieve a performance gain. Speaker change detection experiments show that the multifeature integration approach significantly outperforms the individual features with relative improvements of 26% over the LSP-only approach and 6% over the MFCC-only approach. Index Terms— speaker change detection, speaker segmentation, audio segmentation, audio content analysis


Cite as: Xie, L., Wang, G.-S. (2008) A Two-stage Multi-feature Integration Approach to Unsupervised Speaker Change Detection in Real-time News Broadcasting. Proc. International Symposium on Chinese Spoken Language Processing, 350-353

@inproceedings{xie08_iscslp,
  author={Lei Xie and Guang-Sen Wang},
  title={{A Two-stage Multi-feature Integration Approach to Unsupervised Speaker Change Detection in Real-time News Broadcasting}},
  year=2008,
  booktitle={Proc. International Symposium on Chinese Spoken Language Processing},
  pages={350--353}
}