ISCA Archive Interspeech 2017

Joint Training of Expanded End-to-End DNN for Text-Dependent Speaker Verification

Hee-soo Heo, Jee-weon Jung, IL-ho Yang, Sung-hyun Yoon, Ha-jin Yu

We propose an expanded end-to-end DNN architecture for speaker verification based on b-vectors as well as d-vectors. The architecture embeds the components of a speaker verification system: frame-level feature modeling, utterance-level feature extraction, dimensionality reduction of the utterance-level features, and trial-level scoring. The main contribution of this paper is that, instead of using independently trained DNNs as parts of the system, we pre-train each part and then train the whole system jointly with a fine-tuning cost. Experimental results show that the proposed system outperforms both the baseline d-vector system and the i-vector PLDA system.
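
The abstract names d-vectors and b-vectors without defining them. The following is a minimal sketch of how the two are commonly described in the wider speaker-verification literature, not the authors' implementation: it assumes a d-vector is the per-dimension average of frame-level hidden activations, and a b-vector is the concatenation of the element-wise sum, absolute difference, and product of two utterance-level vectors. The function names `d_vector` and `b_vector` and the toy numbers are hypothetical, and the joint training the paper proposes is not shown.

```python
def d_vector(frame_activations):
    """Average frame-level activations into one utterance-level vector.

    Assumption: each inner list holds one frame's hidden-layer outputs.
    """
    n_frames = len(frame_activations)
    return [sum(dim) / n_frames for dim in zip(*frame_activations)]


def b_vector(x, y):
    """Build a trial-level vector from two utterance-level vectors.

    Assumption: concatenation of element-wise sum, absolute difference,
    and product, as b-vectors are usually described.
    """
    added = [a + b for a, b in zip(x, y)]
    diff = [abs(a - b) for a, b in zip(x, y)]
    prod = [a * b for a, b in zip(x, y)]
    return added + diff + prod


# Toy data: two frames of two-dimensional activations per utterance.
enroll = d_vector([[1.0, 2.0], [3.0, 4.0]])
test = d_vector([[0.0, 1.0], [2.0, 1.0]])
trial = b_vector(enroll, test)
print(trial)
```

In a pipeline of this kind, a vector such as `trial` would be the input to a trial-level scoring network; here it only illustrates the shape of the data passed between stages.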


doi: 10.21437/Interspeech.2017-1050

Cite as: Heo, H.-s., Jung, J.-w., Yang, I.-h., Yoon, S.-h., Yu, H.-j. (2017) Joint Training of Expanded End-to-End DNN for Text-Dependent Speaker Verification. Proc. Interspeech 2017, 1532-1536, doi: 10.21437/Interspeech.2017-1050

@inproceedings{heo17_interspeech,
  author={Hee-soo Heo and Jee-weon Jung and IL-ho Yang and Sung-hyun Yoon and Ha-jin Yu},
  title={{Joint Training of Expanded End-to-End DNN for Text-Dependent Speaker Verification}},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={1532--1536},
  doi={10.21437/Interspeech.2017-1050}
}