ISCA Archive Interspeech 2006

Improved speech activity detection using cross-channel features for recognition of multiparty meetings

Kofi Boakye, Andreas Stolcke

We describe the development of a speech activity detection system that uses an HMM-based segmenter for automatic speech recognition on individual headset microphones in multispeaker meetings. We investigate cross-channel features, both energy-based and correlation-based, for incorporation into the segmenter to address errors arising from cross-channel phenomena such as crosstalk. Results demonstrate that these features provide a marked improvement (18% relative) over a baseline system using single-channel features, as well as an improvement (8% relative) over our previous solution of separate speech activity detection and cross-channel analysis. In addition, the simple cross-channel energy features are shown to be more robust, and consequently better performing, than the more common correlation-based features.


doi: 10.21437/Interspeech.2006-538

Cite as: Boakye, K., Stolcke, A. (2006) Improved speech activity detection using cross-channel features for recognition of multiparty meetings. Proc. Interspeech 2006, paper 1824-Wed3A1O.3, doi: 10.21437/Interspeech.2006-538

@inproceedings{boakye06_interspeech,
  author={Kofi Boakye and Andreas Stolcke},
  title={{Improved speech activity detection using cross-channel features for recognition of multiparty meetings}},
  year=2006,
  booktitle={Proc. Interspeech 2006},
  pages={paper 1824-Wed3A1O.3},
  doi={10.21437/Interspeech.2006-538}
}