ISCA Archive Odyssey 2022

Learning Noise Robust ResNet-Based Speaker Embedding for Speaker Recognition

Mohammad MohammadAmini, Driss Matrouf, Jean-François Bonastre, Sandipana Dowerah, Romain Serizel, Denis Jouvet

Background noise and reverberation, especially in distant speech utterances, degrade the performance of speaker recognition systems. This challenge has been addressed at different levels, from signal-level processing in the front end to adaptation of the scoring technique in the back end. In this paper, we propose two new variants of ResNet-based speaker recognition systems that make the speaker embedding more robust to additive noise and reverberation. The goal of the proposed systems is to extract x-vectors in noisy environments that are close to the corresponding x-vectors in a clean environment. To this end, the speaker embedding network jointly minimizes the speaker classification loss and the distance between pairs of noisy and clean x-vectors. The proposed systems are tested with four evaluation protocols and compared with the baseline ResNet system. In different conditions with real and simulated noise and reverberation, the modified systems outperform the baseline. In the presence of artificial noise and reverberation, we achieve a 19% improvement in EER. The main advantage of the proposed systems is their effectiveness against real noise and reverberation, where we achieve a 15% improvement in EER.
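The joint objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the distance measure, the weighting between the two terms, or which input the classification loss is computed on. The sketch assumes a mean squared distance, a hypothetical weight `alpha`, and classification logits computed from the noisy input; all function names are illustrative.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for a single example (numerically stable)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def mean_squared_distance(a, b):
    """Mean squared difference between two equal-length embeddings.

    Assumption: the abstract only says "distance"; MSE is one possible choice.
    """
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def joint_loss(logits, label, noisy_xvec, clean_xvec, alpha=1.0):
    """Speaker classification loss plus a weighted noisy/clean x-vector distance.

    `alpha` is a hypothetical weighting factor, not taken from the paper.
    """
    classification = cross_entropy(logits, label)
    distance = mean_squared_distance(noisy_xvec, clean_xvec)
    return classification + alpha * distance


# Toy example: 2 speakers, 2-dimensional embeddings.
loss = joint_loss([0.0, 0.0], 0, [2.0, 0.0], [0.0, 0.0], alpha=0.5)
print(loss)
```

When the noisy and clean x-vectors coincide, the distance term vanishes and the objective reduces to the plain classification loss, which corresponds to training the baseline system.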


doi: 10.21437/Odyssey.2022-6

Cite as: MohammadAmini, M., Matrouf, D., Bonastre, J.-F., Dowerah, S., Serizel, R., Jouvet, D. (2022) Learning Noise Robust ResNet-Based Speaker Embedding for Speaker Recognition. Proc. The Speaker and Language Recognition Workshop (Odyssey 2022), 41-46, doi: 10.21437/Odyssey.2022-6

@inproceedings{mohammadamini22_odyssey,
  author={Mohammad MohammadAmini and Driss Matrouf and Jean-François Bonastre and Sandipana Dowerah and Romain Serizel and Denis Jouvet},
  title={{Learning Noise Robust ResNet-Based Speaker Embedding for Speaker Recognition}},
  year=2022,
  booktitle={Proc. The Speaker and Language Recognition Workshop (Odyssey 2022)},
  pages={41--46},
  doi={10.21437/Odyssey.2022-6}
}