Deep Speaker Embedding Extraction with Channel-Wise Feature Responses and Additive Supervision Softmax Loss Function

Jianfeng Zhou, Tao Jiang, Zheng Li, Lin Li, Qingyang Hong


In speaker verification, convolutional neural networks (CNNs) have been leveraged successfully to achieve strong performance. Most CNN-based models focus primarily on learning a distinctive speaker embedding along the horizontal direction (the time axis), while the relationship between feature channels is usually neglected. In this paper, we first explore an alternative direction: recalibrating channel-wise features by introducing the “squeeze-and-excitation” (SE) module, recently proposed for image classification. We incorporate SE blocks into deep residual networks (ResNet-SE) and demonstrate a slight improvement on the VoxCeleb corpora. Additionally, we propose a new loss function, additive supervision softmax (AS-Softmax), which makes full use of prior knowledge of the mis-classified samples at the training stage by imposing a larger penalty on them to regularize the training process. Experimental results on the VoxCeleb corpora demonstrate that the proposed loss further improves the performance of the speaker verification system, especially when ResNet-SE is combined with AS-Softmax.
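To illustrate the channel-wise recalibration the abstract refers to, the following is a minimal NumPy sketch of a generic squeeze-and-excitation block as it is usually described for image classification. It is not the authors' ResNet-SE implementation: the function name `se_block`, the feature-map layout (channels, frequency, time), the reduction ratio `r`, and the random weights are all assumptions made for illustration. AS-Softmax is not sketched here because the abstract does not give its formula.

```python
import numpy as np

def se_block(x, w1, b1, w2, b2):
    """Generic squeeze-and-excitation over feature maps x of shape (C, F, T).

    Hypothetical helper for illustration; layout and weights are assumptions.
    """
    # Squeeze: global average pooling over frequency and time -> shape (C,)
    z = x.mean(axis=(1, 2))
    # Excitation: bottleneck FC + ReLU, then FC + sigmoid -> gates in (0, 1)
    h = np.maximum(0.0, w1 @ z + b1)
    s = 1.0 / (1.0 + np.exp(-(w2 @ h + b2)))
    # Scale: multiply every channel by its gate (broadcast over F and T)
    return x * s[:, None, None]

# Toy usage with assumed sizes: 8 channels, 4 frequency bins, 10 frames, r = 4
rng = np.random.default_rng(0)
C, F, T, r = 8, 4, 10, 4
x = rng.standard_normal((C, F, T))
w1 = rng.standard_normal((C // r, C)) * 0.1
b1 = np.zeros(C // r)
w2 = rng.standard_normal((C, C // r)) * 0.1
b2 = np.zeros(C)
y = se_block(x, w1, b1, w2, b2)
```

The output has the same shape as the input, so in a residual network such a block would typically be applied to the output of the residual branch before it is added to the shortcut; where exactly the paper places it is not stated in the abstract.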


DOI: 10.21437/Interspeech.2019-1704

Cite as: Zhou, J., Jiang, T., Li, Z., Li, L., Hong, Q. (2019) Deep Speaker Embedding Extraction with Channel-Wise Feature Responses and Additive Supervision Softmax Loss Function. Proc. Interspeech 2019, 2883-2887, DOI: 10.21437/Interspeech.2019-1704.


@inproceedings{Zhou2019,
  author={Jianfeng Zhou and Tao Jiang and Zheng Li and Lin Li and Qingyang Hong},
  title={{Deep Speaker Embedding Extraction with Channel-Wise Feature Responses and Additive Supervision Softmax Loss Function}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={2883--2887},
  doi={10.21437/Interspeech.2019-1704},
  url={http://dx.doi.org/10.21437/Interspeech.2019-1704}
}