ISCA Archive Interspeech 2021

Gradient Regularization for Noise-Robust Speaker Verification

Jianchen Li, Jiqing Han, Hongwei Song

Noise robustness remains a challenge for speaker recognition systems. One of the most common remedies is to jointly train a model on both clean and noisy utterances. However, the gradients computed on noisy utterances generally contain speaker-irrelevant noise components, which leads to overfitting to the seen noisy data and poor generalization to unseen noisy environments. To alleviate this problem, we propose a gradient regularization method that reduces the speaker-irrelevant noise components by aligning the gradients of noisy utterances with those of their clean counterparts. Specifically, the gradients on noisy utterances are forced to follow the directions of the gradients computed on their clean counterparts, and the gradients across different types of noisy utterances are also aligned to point in similar directions. Because this alignment reduces the noise-related components of the gradients, the speaker model is prevented from encoding irrelevant noise information. To achieve these gradient regularization goals, we also propose a novel sequential inner training strategy. Experiments on the VoxCeleb1 dataset indicate that our method achieves the best performance in both seen and unseen noisy environments.
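
As a rough illustration of the alignment idea described in the abstract, the sketch below measures how far the gradient on a noisy input deviates in direction from the gradient on its clean counterpart. It is not the authors' implementation and does not reproduce the sequential inner training strategy. The toy linear model, the squared loss, and the names `grad_squared_loss`, `cosine`, and `alignment_penalty` are all assumptions introduced here for illustration, and the cosine-based penalty is only one plausible way to quantify gradient misalignment.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def grad_squared_loss(w, x, y):
    """Gradient of the toy loss (w.x - y)^2 with respect to w."""
    err = dot(w, x) - y
    return [2.0 * err * xi for xi in x]


def cosine(a, b, eps=1e-12):
    """Cosine similarity; eps guards against division by zero."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)) + eps)


def alignment_penalty(g_clean, g_noisy):
    """Hypothetical penalty: 0 when the gradients point the same way,
    2 when they point in opposite directions."""
    return 1.0 - cosine(g_clean, g_noisy)


# Toy data (assumed): a clean feature vector and a noisy copy of it.
w = [0.5, -0.2, 0.1]
x_clean = [1.0, 2.0, 3.0]
noise = [0.3, -0.1, 0.2]
x_noisy = [c + n for c, n in zip(x_clean, noise)]
y = 1.0

g_clean = grad_squared_loss(w, x_clean, y)
g_noisy = grad_squared_loss(w, x_noisy, y)

# A small value means the noisy gradient nearly follows the clean one.
penalty = alignment_penalty(g_clean, g_noisy)
print(f"alignment penalty: {penalty:.4f}")
```

In a training loop, a term of this kind could be added to the task loss so that updates driven by noisy utterances are discouraged from drifting away from the clean-gradient direction. The same penalty could be applied between gradients from different noise types.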

doi: 10.21437/Interspeech.2021-1216

Cite as: Li, J., Han, J., Song, H. (2021) Gradient Regularization for Noise-Robust Speaker Verification. Proc. Interspeech 2021, 1074-1078, doi: 10.21437/Interspeech.2021-1216

  author={Jianchen Li and Jiqing Han and Hongwei Song},
  title={{Gradient Regularization for Noise-Robust Speaker Verification}},
  booktitle={Proc. Interspeech 2021},