LEAP Diarization System for the Second DIHARD Challenge

Prachi Singh, Harsha Vardhan M.A., Sriram Ganapathy, A. Kanagasundaram


This paper presents the LEAP System, developed for the Second DIHARD diarization challenge. The evaluation data in the challenge is composed of multi-talker speech in restaurants, doctor-patient conversations, child language acquisition recordings in home environments and audio extracted from YouTube videos. The LEAP system is developed using two types of embeddings, one based on i-vector representations and the other based on x-vector representations. The initial diarization output, obtained using agglomerative hierarchical clustering (AHC) on probabilistic linear discriminant analysis (PLDA) scores, is refined using a Variational-Bayes hidden Markov model (VB-HMM). We propose a modified VB-HMM with posterior scaling, which provides significant improvements in the final diarization error rate (DER). We also apply domain compensation to the i-vector features to reduce the mismatch between training and evaluation conditions. Using the proposed approaches, we obtain relative improvements in DER over the DIHARD baseline system of about 7.1% for the best individual system and about 13.7% for the final system combination on the evaluation set. An analysis of the proposed posterior scaling method shows that scaling results in improved discrimination among the HMM states in the VB-HMM.
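The initial clustering step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a precomputed symmetric matrix of pairwise segment similarities standing in for PLDA scores, and it assumes average linkage and a fixed stopping threshold, neither of which is specified in the abstract. The function name `ahc` and its parameters are hypothetical.

```python
def ahc(scores, threshold):
    """Average-linkage agglomerative clustering on a similarity matrix.

    scores[i][j] is a symmetric similarity between segments i and j
    (higher means more likely the same speaker); here it only stands in
    for a PLDA score. Linkage type and threshold are assumptions.
    Returns one cluster label per segment.
    """
    # Start with every segment in its own cluster.
    clusters = [[i] for i in range(len(scores))]

    def linkage(a, b):
        # Average pairwise similarity between members of two clusters.
        return sum(scores[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the most similar pair of clusters.
        best, pair = None, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = linkage(clusters[x], clusters[y])
                if best is None or s > best:
                    best, pair = s, (x, y)
        # Stop once even the best pair is below the threshold.
        if best < threshold:
            break
        x, y = pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]

    # Convert cluster membership into a per-segment label list.
    labels = [0] * len(scores)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels
```

Calling `ahc(scores, threshold)` on a score matrix yields a speaker label per segment; in a pipeline like the one described, such labels would serve only as the initialization that a VB-HMM stage then refines.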


DOI: 10.21437/Interspeech.2019-2716

Cite as: Singh, P., M.A., H.V., Ganapathy, S., Kanagasundaram, A. (2019) LEAP Diarization System for the Second DIHARD Challenge. Proc. Interspeech 2019, 983-987, DOI: 10.21437/Interspeech.2019-2716.


@inproceedings{Singh2019,
  author={Prachi Singh and Harsha Vardhan M.A. and Sriram Ganapathy and A. Kanagasundaram},
  title={{LEAP Diarization System for the Second DIHARD Challenge}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={983--987},
  doi={10.21437/Interspeech.2019-2716},
  url={http://dx.doi.org/10.21437/Interspeech.2019-2716}
}