In speaker recognition, a well-known problem is that speech features vary across sentences and over time. This variation is mainly attributed to the phonetic information and the speaker information contained in the speech data. If these two kinds of information can be separated from each other, robust speaker recognition becomes possible. In this study, we propose a speaker recognition method that separates phonetic information from speaker information by a subspace method, under the assumption that the subspace with large within-speaker variance is a "phonetic space" and the subspace with small within-speaker variance is a "speaker space". We carried out comparative experiments of the proposed method against a conventional GMM-based method in the observation space as well as in a space transformed by LDA. The results show that the proposed method constructs a robust speaker model with few model parameters from a small amount of training data.
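The abstract only outlines the variance-based split; the sketch below is one way such a separation might be implemented, assuming frame-level features grouped by speaker. It is not the authors' implementation, and the function and parameter names (e.g. split_phonetic_speaker_subspaces, speaker_dim) are illustrative only.

```python
import numpy as np

def split_phonetic_speaker_subspaces(features_by_speaker, speaker_dim):
    """Split the feature space into a 'phonetic' subspace (directions of
    large within-speaker variance) and a 'speaker' subspace (directions of
    small within-speaker variance), following the idea in the abstract.

    features_by_speaker: list of (n_frames_i, dim) arrays, one per speaker.
    speaker_dim: number of small-variance directions kept as the speaker space.
    """
    dim = features_by_speaker[0].shape[1]

    # Within-speaker scatter: spread of frames around each speaker's own mean.
    S_w = np.zeros((dim, dim))
    for X in features_by_speaker:
        Xc = X - X.mean(axis=0, keepdims=True)
        S_w += Xc.T @ Xc
    S_w /= sum(X.shape[0] for X in features_by_speaker)

    # Eigendecomposition; eigh returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(S_w)

    # Small within-speaker variance -> speaker space,
    # large within-speaker variance -> phonetic space.
    speaker_basis = eigvecs[:, :speaker_dim]
    phonetic_basis = eigvecs[:, speaker_dim:]
    return speaker_basis, phonetic_basis

def project_to_speaker_space(X, speaker_basis):
    """Project frame features onto the speaker subspace before modelling
    each speaker (e.g. with a GMM) in that reduced space."""
    return X @ speaker_basis
```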
Cite as: Nishida, M., Ariki, Y. (2001) Speaker recognition by separating phonetic space and speaker space. Proc. 7th European Conference on Speech Communication and Technology (Eurospeech 2001), 1381-1384, doi: 10.21437/Eurospeech.2001-357
@inproceedings{nishida01_eurospeech,
  author    = {M. Nishida and Y. Ariki},
  title     = {{Speaker recognition by separating phonetic space and speaker space}},
  year      = {2001},
  booktitle = {Proc. 7th European Conference on Speech Communication and Technology (Eurospeech 2001)},
  pages     = {1381--1384},
  doi       = {10.21437/Eurospeech.2001-357}
}