Bayesian HMM Based x-Vector Clustering for Speaker Diarization

Mireia Diez, Lukáš Burget, Shuai Wang, Johan Rohdin, Jan Černocký


This paper presents a simplified version of a previously proposed diarization algorithm based on Bayesian Hidden Markov Models, which uses Variational Bayesian (VB) inference for very fast and robust clustering of x-vectors (neural-network-based speaker embeddings). The presented results show that this clustering algorithm significantly improves diarization performance compared with the previously used Agglomerative Hierarchical Clustering. The output of this system can also be used to initialize a second-stage VB diarization system, which takes frame-wise MFCC features as input, to obtain optimal results.


DOI: 10.21437/Interspeech.2019-2813

Cite as: Diez, M., Burget, L., Wang, S., Rohdin, J., Černocký, J. (2019) Bayesian HMM Based x-Vector Clustering for Speaker Diarization. Proc. Interspeech 2019, 346-350, DOI: 10.21437/Interspeech.2019-2813.


@inproceedings{Diez2019,
  author={Mireia Diez and Lukáš Burget and Shuai Wang and Johan Rohdin and Jan Černocký},
  title={{Bayesian HMM Based x-Vector Clustering for Speaker Diarization}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={346--350},
  doi={10.21437/Interspeech.2019-2813},
  url={http://dx.doi.org/10.21437/Interspeech.2019-2813}
}