Target Speaker Extraction for Multi-Talker Speaker Verification

Wei Rao, Chenglin Xu, Eng Siong Chng, Haizhou Li


The performance of speaker verification degrades significantly when the test speech is corrupted by interference from non-target speakers. Speaker diarization separates speakers well only when their speech does not overlap. When multiple talkers speak at the same time, however, a technique is needed to separate the speech in the spectral domain. In this paper, we study a way to extract the target speaker’s speech from overlapped multi-talker speech. Specifically, given some reference speech samples from the target speaker, the target speaker’s speech is first extracted from the overlapped multi-talker speech, and the extracted speech is then passed to the speaker verification system. Experimental results show that the proposed approach significantly improves the performance of overlapped multi-talker speaker verification, achieving a 64.4% relative EER reduction over the zero-effort baseline.
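
For readers unfamiliar with the metric, a relative EER reduction is the drop in equal error rate expressed as a fraction of the baseline EER. The sketch below illustrates that arithmetic only; the function name and the EER values are hypothetical and are not taken from the paper.

```python
def relative_eer_reduction(baseline_eer: float, system_eer: float) -> float:
    """Return the relative EER reduction in percent.

    Both arguments are EERs in the same unit (for example, percent).
    """
    if baseline_eer <= 0:
        raise ValueError("baseline_eer must be positive")
    return 100.0 * (baseline_eer - system_eer) / baseline_eer


# Hypothetical EER values in percent, chosen for illustration only
# (assumption: these are NOT the figures reported in the paper).
baseline = 20.0
with_extraction = 10.0

reduction = relative_eer_reduction(baseline, with_extraction)
print(f"{reduction:.1f}% relative EER reduction")
```

With these illustrative inputs, halving the EER corresponds to a 50% relative reduction; the 64.4% figure above would be obtained the same way from the baseline and proposed-system EERs.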


DOI: 10.21437/Interspeech.2019-1410

Cite as: Rao, W., Xu, C., Chng, E.S., Li, H. (2019) Target Speaker Extraction for Multi-Talker Speaker Verification. Proc. Interspeech 2019, 1273-1277, DOI: 10.21437/Interspeech.2019-1410.


@inproceedings{Rao2019,
  author={Wei Rao and Chenglin Xu and Eng Siong Chng and Haizhou Li},
  title={{Target Speaker Extraction for Multi-Talker Speaker Verification}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={1273--1277},
  doi={10.21437/Interspeech.2019-1410},
  url={http://dx.doi.org/10.21437/Interspeech.2019-1410}
}