ISCA Archive IberSPEECH 2022

Intelligent Voice Speaker Recognition and Diarization System for IberSpeech 2022 Albayzin Evaluations Speaker Diarization and Identity Assignment Challenge

Roman Shrestha, Cornelius Glackin, Julie Wall, Nigel Cannings

This paper describes the system developed by Intelligent Voice for the IberSpeech 2022 Albayzin Evaluations Speaker Diarization and Identity Assignment Challenge (SDIAC). The presented Variational Bayes x-vector Voice Print Extraction (VBxVPE) system captures vocal variations using multiple x-vector representations with two-stage clustering and outlier-detection refinement. It also implements a Deep-Encoder Convolutional Autoencoder Denoiser (DE-CADE) network that denoises segments containing noise and music, for robust speaker recognition and diarization. When evaluated on the Radiotelevisión Española (RTVE) 2022 evaluation dataset, the system obtained a Diarization Error Rate (DER) of 37.2% for the Speaker Diarization and Identity Assignment task and 44.34% for the Speaker Diarization only task.
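For readers unfamiliar with the metric, DER is conventionally the sum of missed speech, false-alarm speech, and speaker-confusion time, divided by the total reference speech time. The sketch below illustrates that standard definition only. The function name and the example durations are hypothetical and are not taken from the paper or from the challenge scoring tools.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_speech):
    """Return DER as a percentage; all arguments are durations in seconds."""
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    # Standard definition: all error time over total reference speech time.
    return 100.0 * (missed + false_alarm + confusion) / total_speech


# Hypothetical durations for illustration (not results from the paper).
der = diarization_error_rate(
    missed=30.0, false_alarm=20.0, confusion=50.0, total_speech=1000.0
)
print(f"DER = {der:.1f}%")
```

Because false-alarm time is counted in the numerator but not in the denominator, DER can exceed 100% on recordings with a lot of spuriously detected speech.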


Cite as: Shrestha, R., Glackin, C., Wall, J., Cannings, N. (2022) Intelligent Voice Speaker Recognition and Diarization System for IberSpeech 2022 Albayzin Evaluations Speaker Diarization and Identity Assignment Challenge. Proc. IberSPEECH 2022, 281-283

@inproceedings{shrestha22_iberspeech,
  author={Roman Shrestha and Cornelius Glackin and Julie Wall and Nigel Cannings},
  title={{Intelligent Voice Speaker Recognition and Diarization System for IberSpeech 2022 Albayzin Evaluations Speaker Diarization and Identity Assignment Challenge}},
  year=2022,
  booktitle={Proc. IberSPEECH 2022},
  pages={281--283}
}