ISCA Archive Odyssey 2014

Modeling Overlapping Speech using Vector Taylor Series

Pranay Dighe, Marc Ferras, Herve Bourlard

Current speaker diarization systems typically fail to assign speech correctly when multiple speakers talk simultaneously. According to previous studies, overlapping-speech errors account for a large proportion of the total errors in multi-party speech diarization. In this work, we propose a new approach using Vector Taylor Series (VTS) to obtain overlapping speech models, assuming individual speaker models are available, e.g. from the diarization output. We extend the VTS framework to use multiple acoustic classes to account for the non-stationarity of the corrupting speaker's speech. We propose a system using multi-class VTS to detect single-speaker and two-speaker overlapping speech as well as the speakers involved. We show the effectiveness of the approach on distant-microphone meeting data, with the multi-class approach in particular performing at the state-of-the-art level.
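
The abstract does not give the paper's exact formulation, so the following is only a minimal sketch of the general first-order VTS idea: building an overlapping-speech Gaussian from two single-speaker Gaussians. It assumes log-spectral features, sources that add in the power domain, diagonal covariances, and one Gaussian per speaker; the function name `vts_combine` and all parameter names are hypothetical and are not taken from the paper.

```python
import numpy as np

def vts_combine(mu_a, var_a, mu_b, var_b):
    """First-order VTS approximation of an overlap Gaussian (illustrative only).

    Assumption: features are log-spectral, so two overlapping sources a and b
    combine as y = log(exp(a) + exp(b)) = a + log(1 + exp(b - a)).
    All inputs are per-dimension means and variances (diagonal covariance).
    """
    mu_a, var_a = np.asarray(mu_a, float), np.asarray(var_a, float)
    mu_b, var_b = np.asarray(mu_b, float), np.asarray(var_b, float)

    # Zeroth-order term: evaluate the mismatch function at the two means.
    mu_y = np.logaddexp(mu_a, mu_b)

    # Jacobian dy/da at the expansion point, i.e. exp(a) / (exp(a) + exp(b)),
    # computed in a numerically stable way; dy/db is 1 - g.
    g = np.exp(mu_a - mu_y)

    # First-order term: propagate both variances through the linearization.
    var_y = g**2 * var_a + (1.0 - g)**2 * var_b
    return mu_y, var_y

# Two equally loud speakers in one band, and one band dominated by speaker a.
mu_y, var_y = vts_combine([0.0, 10.0], [1.0, 2.0], [0.0, -10.0], [1.0, 5.0])
```

With equal means the combined mean rises by log 2 and each source contributes a quarter of its variance; when one speaker dominates a band, the overlap model in that band approaches the dominant speaker's model. A multi-class extension, as described in the abstract, would apply such a combination per pair of acoustic classes rather than once per speaker pair.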


doi: 10.21437/Odyssey.2014-30

Cite as: Dighe, P., Ferras, M., Bourlard, H. (2014) Modeling Overlapping Speech using Vector Taylor Series. Proc. The Speaker and Language Recognition Workshop (Odyssey 2014), 194-199, doi: 10.21437/Odyssey.2014-30

@inproceedings{dighe14_odyssey,
  author={Pranay Dighe and Marc Ferras and Herve Bourlard},
  title={{Modeling Overlapping Speech using Vector Taylor Series}},
  year=2014,
  booktitle={Proc. The Speaker and Language Recognition Workshop (Odyssey 2014)},
  pages={194--199},
  doi={10.21437/Odyssey.2014-30}
}