Alternative Approaches to Neural Network Based Speaker Verification

Anna Silnova, Lukáš Burget, Jan Černocký


Just like in other areas of automatic speech processing, feature extraction based on bottleneck neural networks was recently found very effective for the speaker verification task. However, better results are usually reported with more complex neural network architectures (e.g. stacked bottlenecks), which are difficult to reproduce. In this work, we experiment with the so-called deep features, which are based on a simple feed-forward neural network architecture. We study various forms of applying deep features to i-vector/PLDA based speaker verification. With proper settings, this simple architecture can yield better verification performance than the more elaborate bottleneck features. We further experiment with multi-task training, where the neural network is trained for both speaker recognition and senone recognition objectives. Results indicate that, with a careful weighting of the two objectives, multi-task training can result in significantly better performing deep features.
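
The weighting of the two training objectives mentioned above can be illustrated with a minimal sketch. It assumes a convex combination of two softmax cross-entropy losses controlled by a single weight; the abstract does not state the exact weighting scheme, so the parameter `alpha` and the function names `cross_entropy` and `multitask_loss` are hypothetical and chosen only for illustration.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy of `logits` against the class index `target`."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


def multitask_loss(speaker_logits, speaker_target,
                   senone_logits, senone_target, alpha=0.5):
    """Weighted sum of the speaker and senone losses for one frame.

    `alpha` is an assumed weight in [0, 1]: alpha=1 trains for speaker
    recognition only, alpha=0 for senone recognition only.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    speaker_loss = cross_entropy(speaker_logits, speaker_target)
    senone_loss = cross_entropy(senone_logits, senone_target)
    return alpha * speaker_loss + (1.0 - alpha) * senone_loss


if __name__ == "__main__":
    # Toy logits from two output layers sharing the same hidden layers.
    loss = multitask_loss([2.0, 0.5, -1.0], 0, [0.1, 1.2, 0.3, -0.4], 1, alpha=0.3)
    print(f"combined loss: {loss:.4f}")
```

In a sketch like this, sweeping `alpha` over a few values on held-out data is one plausible way to look for the balance between the two objectives.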


DOI: 10.21437/Interspeech.2017-1062

Cite as: Silnova, A., Burget, L., Černocký, J. (2017) Alternative Approaches to Neural Network Based Speaker Verification. Proc. Interspeech 2017, 1572-1575, DOI: 10.21437/Interspeech.2017-1062.


@inproceedings{Silnova2017,
  author={Anna Silnova and Lukáš Burget and Jan Černocký},
  title={Alternative Approaches to Neural Network Based Speaker Verification},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={1572--1575},
  doi={10.21437/Interspeech.2017-1062},
  url={http://dx.doi.org/10.21437/Interspeech.2017-1062}
}