i-Vector Transformation Using a Novel Discriminative Denoising Autoencoder for Noise-Robust Speaker Recognition

Shivangi Mahto, Hitoshi Yamamoto, Takafumi Koshinaka


This paper proposes i-vector transformations using neural networks to achieve noise-robust speaker recognition. A novel discriminative denoising autoencoder (DDAE) is applied to i-vectors to remove the effects of additive noise. The DDAE is trained to denoise and classify noisy i-vectors simultaneously, which adds discriminability to the denoised i-vectors. Speaker recognition experiments on the NIST SRE 2012 task show 32% better error performance compared with a baseline system. The proposed method also outperforms conventional methods such as multi-condition training and a basic denoising autoencoder.
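The joint training idea can be illustrated with a minimal NumPy sketch of a combined objective: a reconstruction term that pulls the network output toward the clean i-vector, plus a cross-entropy term that classifies the speaker. This is only an illustration under stated assumptions, not the authors' implementation. The abstract does not specify the architecture, so the single tanh hidden layer, the placement of the classifier on top of the denoised output, the weighting factor `alpha`, and the names `init_params` and `ddae_loss` are all hypothetical.

```python
import numpy as np

def init_params(dim, hidden, n_speakers, seed=0):
    """Random weights for a toy network (all sizes are illustrative assumptions)."""
    rng = np.random.default_rng(seed)
    return (
        0.1 * rng.normal(size=(dim, hidden)), np.zeros(hidden),          # encoder
        0.1 * rng.normal(size=(hidden, dim)), np.zeros(dim),             # decoder
        0.1 * rng.normal(size=(dim, n_speakers)), np.zeros(n_speakers),  # classifier
    )

def ddae_loss(noisy, clean, labels, params, alpha=0.5):
    """Forward pass returning (total, reconstruction, cross-entropy) losses.

    noisy, clean: arrays of shape (batch, dim) holding paired i-vectors.
    labels: integer speaker indices of shape (batch,).
    alpha: assumed weight balancing the denoising and classification terms.
    """
    W_enc, b_enc, W_dec, b_dec, W_cls, b_cls = params

    # Denoising path: noisy i-vector -> hidden layer -> denoised i-vector.
    hidden = np.tanh(noisy @ W_enc + b_enc)
    denoised = hidden @ W_dec + b_dec

    # Classification path (assumed to sit on top of the denoised i-vector).
    logits = denoised @ W_cls + b_cls

    # Reconstruction term: mean squared distance to the clean i-vector.
    recon = np.mean(np.sum((denoised - clean) ** 2, axis=1))

    # Classification term: softmax cross-entropy, computed stably.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    ce = -np.mean(log_probs[np.arange(len(labels)), labels])

    return alpha * recon + (1.0 - alpha) * ce, recon, ce
```

Setting `alpha=1.0` in this sketch reduces the objective to that of a basic denoising autoencoder, which is one way to see how the discriminative term differs from the conventional approach mentioned above.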


DOI: 10.21437/Interspeech.2017-731

Cite as: Mahto, S., Yamamoto, H., Koshinaka, T. (2017) i-Vector Transformation Using a Novel Discriminative Denoising Autoencoder for Noise-Robust Speaker Recognition. Proc. Interspeech 2017, 3722-3726, DOI: 10.21437/Interspeech.2017-731.


@inproceedings{Mahto2017,
  author={Shivangi Mahto and Hitoshi Yamamoto and Takafumi Koshinaka},
  title={i-Vector Transformation Using a Novel Discriminative Denoising Autoencoder for Noise-Robust Speaker Recognition},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={3722--3726},
  doi={10.21437/Interspeech.2017-731},
  url={http://dx.doi.org/10.21437/Interspeech.2017-731}
}