Joint Sound Source Separation and Speaker Recognition

Jeroen Zegers, Hugo Van hamme


Non-negative Matrix Factorization (NMF) has already been applied to learn speaker characterizations from single or non-simultaneous speech for speaker recognition applications. It is also known for its good performance in (blind) source separation for simultaneous speech. This paper explains how NMF can be used to jointly solve the two problems in a multichannel speaker recognizer for simultaneous speech. It is shown how state-of-the-art multichannel NMF for blind source separation can be easily extended to incorporate speaker recognition. Experiments on the CHiME corpus show that this method outperforms the sequential approach of first applying source separation, followed by speaker recognition that uses state-of-the-art i-vector techniques.
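For readers unfamiliar with the basic factorization, the following is a minimal sketch of plain single-channel NMF with the standard multiplicative updates for the squared Euclidean cost. It is not the multichannel model or the joint separation and recognition method of the paper. The function names, the toy matrix and all parameter values are assumptions made purely for illustration. In audio applications V would typically be a non-negative magnitude spectrogram, the columns of W spectral patterns and the rows of H their activations over time.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def squared_error(v, w, h):
    """Squared Euclidean distance between V and its approximation W H."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rwh in zip(v, wh) for x, y in zip(rv, rwh))

def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Approximate V (n x m) by W (n x rank) times H (rank x m), all non-negative."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    # Random positive initialization (an illustrative choice).
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + eps for _ in range(n_cols)] for _ in range(rank)]
    errors = [squared_error(v, w, h)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_cols)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n_rows)]
        errors.append(squared_error(v, w, h))
    return w, h, errors

# Toy non-negative "spectrogram" built from two hypothetical patterns.
true_w = [[1, 0], [1, 0], [0, 1], [0, 1]]
true_h = [[1, 0, 2, 0, 1], [0, 3, 0, 1, 1]]
v = matmul(true_w, true_h)

w, h, errors = nmf(v, rank=2)
print("initial error:", errors[0])
print("final error:  ", errors[-1])
```

Because the updates only multiply by non-negative ratios, W and H stay non-negative throughout, which is what makes the learned patterns interpretable as additive parts.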


DOI: 10.21437/Interspeech.2016-773

Cite as

Zegers, J., Van hamme, H. (2016) Joint Sound Source Separation and Speaker Recognition. Proc. Interspeech 2016, 2228-2232.

Bibtex
@inproceedings{Zegers+2016,
author={Jeroen Zegers and Hugo Van hamme},
title={Joint Sound Source Separation and Speaker Recognition},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-773},
url={http://dx.doi.org/10.21437/Interspeech.2016-773},
pages={2228--2232}
}