Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Speaker Identification and Verification Using Eigenvoices

Olivier Thyes, Roland Kuhn, Patrick Nguyen, Jean-Claude Junqua

Panasonic Technologies Inc., Speech Technology Laboratory, Santa Barbara, CA, USA

Gaussian Mixture Models (GMMs) have been successfully applied to speaker identification and verification when a large amount of enrolment data is available to characterize client speakers. However, in many applications it is unreasonable to expect clients to spend that much time training the system. We have therefore been exploring the performance of various methods when only a small amount of enrolment data is available. Under such conditions, the performance of GMMs deteriorates drastically. A possible solution is the "eigenvoice" approach, in which client and test speaker models are confined to a low-dimensional linear subspace obtained previously from a different set of training data. One advantage of the approach is that it does away with the need for impostor models for speaker verification.

After giving a detailed description of the eigenvoice approach, the paper compares the performance of conventional GMMs on speaker ID and verification with that of GMMs obtained by means of the eigenvoice approach. Experimental results are presented to show that conventional GMMs perform better if there are abundant enrolment data, while eigenvoice GMMs perform better if enrolment data are sparse. The paper also gives experimental results for the case where the eigenspace is trained on one database (TIMIT), but client enrolment and testing involve another (YOHO). For this case, we show that performance improves if an environment adaptation technique is applied to the eigenspace. Finally, we discuss priorities for future work.
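
The subspace constraint described in the abstract can be illustrated with a minimal sketch. It rests on assumptions the abstract does not state: each speaker model is flattened into a single "supervector" (for example, concatenated GMM means), the subspace is obtained by principal component analysis, and identification picks the nearest client by Euclidean distance in the subspace. All function names are hypothetical, and this is not presented as the authors' implementation.

```python
import numpy as np

def train_eigenspace(supervectors, k):
    """Fit a k-dimensional linear subspace to training-speaker supervectors.

    supervectors: array of shape (n_speakers, dim).
    Returns the mean supervector (dim,) and an orthonormal basis (k, dim).
    PCA via SVD is an assumption made for this sketch.
    """
    mean = supervectors.mean(axis=0)
    # Rows of vt are orthonormal directions ordered by decreasing variance.
    _, _, vt = np.linalg.svd(supervectors - mean, full_matrices=False)
    return mean, vt[:k]

def to_eigenspace(supervector, mean, basis):
    """Coordinates of a speaker in the subspace (orthogonal projection)."""
    return basis @ (supervector - mean)

def from_eigenspace(weights, mean, basis):
    """Rebuild a full supervector constrained to lie in the subspace."""
    return mean + weights @ basis

def identify(test_weights, client_weights):
    """Return the client whose subspace coordinates are closest.

    client_weights: dict mapping client name to a weight vector.
    Euclidean distance is an illustrative choice, not taken from the paper.
    """
    names = list(client_weights)
    dists = [np.linalg.norm(test_weights - client_weights[n]) for n in names]
    return names[int(np.argmin(dists))]
```

Because a new speaker is described by only k weights rather than a full set of model parameters, far fewer quantities have to be estimated from the enrolment data, which is consistent with the abstract's claim that the approach suits sparse enrolment.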


Bibliographic reference. Thyes, Olivier / Kuhn, Roland / Nguyen, Patrick / Junqua, Jean-Claude (2000): "Speaker identification and verification using eigenvoices", in ICSLP-2000, vol. 2, 242-245.