Transfer Learning for Speaker Verification on Short Utterances

Qingyang Hong, Lin Li, Lihong Wan, Jun Zhang, Feng Tong


Short utterances lack sufficient discriminative information, and their duration variation propagates uncertainty into a probabilistic linear discriminant analysis (PLDA) classifier. Speaker verification on short utterances can be regarded as a domain with a limited amount of long utterances. Therefore, transfer learning of PLDA can be adopted to learn discriminative information from another domain with a large amount of long utterances. In this paper, we explore the effectiveness of transfer learning based PLDA (TL-PLDA) on the NIST SRE and Switchboard (SWB) corpora. Experimental results showed that it could produce the largest performance gain compared with traditional PLDA, especially for short utterances with durations of 5 s and 10 s.


DOI: 10.21437/Interspeech.2016-432

Cite as

Hong, Q., Li, L., Wan, L., Zhang, J., Tong, F. (2016) Transfer Learning for Speaker Verification on Short Utterances. Proc. Interspeech 2016, 1848-1852.

Bibtex
@inproceedings{Hong+2016,
author={Qingyang Hong and Lin Li and Lihong Wan and Jun Zhang and Feng Tong},
title={Transfer Learning for Speaker Verification on Short Utterances},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-432},
url={http://dx.doi.org/10.21437/Interspeech.2016-432},
pages={1848--1852}
}