15th Annual Conference of the International Speech Communication Association

September 14-18, 2014

Nearest Neighbor Discriminant Analysis for Robust Speaker Recognition

Seyed Omid Sadjadi, Jason Pelecanos, Weizhong Zhu

IBM T.J. Watson Research Center, USA

With the advent of i-vectors, linear discriminant analysis (LDA) has become an integral part of many state-of-the-art speaker recognition systems. Here, LDA is primarily employed to annihilate the non-speaker-related (e.g., channel) directions, thereby maximizing the inter-speaker separation. The traditional approach for computing the LDA transform uses parametric representations for both the intra- and inter-speaker scatter matrices, based on the Gaussian distribution assumption. However, the actual distribution of i-vectors is not necessarily Gaussian, particularly in the presence of noise and channel distortions. Motivated by this observation, we present an alternative non-parametric discriminant analysis (NDA) technique that measures both the within- and between-speaker variation on a local basis using the nearest neighbor rule. The effectiveness of the NDA method is evaluated in the context of noisy speaker recognition tasks using speech material from the DARPA Robust Automatic Transcription of Speech (RATS) program. Experimental results indicate that NDA is more effective than traditional parametric LDA for speaker recognition under noisy and channel-degraded conditions.
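The abstract does not give the NDA formulas, so the following is only a rough sketch of the general idea of measuring scatter "on a local basis using the nearest neighbor rule". It assumes one common simplification: each sample is compared against the unweighted mean of its k nearest neighbors in its own class (within-speaker scatter) and in every other class (between-speaker scatter). The function names, the value of `k`, the ridge term, and the omission of any per-sample weighting are all assumptions made for illustration and may differ from what the paper actually uses.

```python
import numpy as np


def local_mean(x, candidates, k):
    """Mean of the k nearest neighbors of x among the rows of `candidates`."""
    dists = np.linalg.norm(candidates - x, axis=1)
    nearest = np.argsort(dists)[:k]
    return candidates[nearest].mean(axis=0)


def nn_scatter_matrices(X, labels, k=3):
    """Nearest-neighbor within- and between-class scatter (unweighted sketch).

    X: (n_samples, dim) array of vectors (e.g. i-vectors).
    labels: (n_samples,) array of class (speaker) labels.
    """
    dim = X.shape[1]
    Sw = np.zeros((dim, dim))
    Sb = np.zeros((dim, dim))
    classes = np.unique(labels)
    for i, (x, y) in enumerate(zip(X, labels)):
        for c in classes:
            mask = labels == c
            if c == y:
                mask = mask.copy()
                mask[i] = False  # a sample is not its own neighbor
            if not mask.any():
                continue  # no other samples available for this class
            d = x - local_mean(x, X[mask], k)
            if c == y:
                Sw += np.outer(d, d)
            else:
                Sb += np.outer(d, d)
    return Sw / len(X), Sb / len(X)


def discriminant_transform(Sw, Sb, n_dims, ridge=1e-6):
    """Projection maximizing between- relative to within-class scatter.

    Whitens Sw (with a small ridge for numerical stability, an assumption),
    then keeps the leading eigenvectors of the whitened Sb.
    """
    Sw = Sw + ridge * np.eye(len(Sw))
    evals, evecs = np.linalg.eigh(Sw)
    W = evecs / np.sqrt(evals)  # columns scaled so that W.T @ Sw @ W = I
    evals_b, evecs_b = np.linalg.eigh(W.T @ Sb @ W)
    order = np.argsort(evals_b)[::-1][:n_dims]
    return W @ evecs_b[:, order]
```

Given an array `X` of vectors and matching `labels`, calling `Sw, Sb = nn_scatter_matrices(X, labels, k=3)` followed by `A = discriminant_transform(Sw, Sb, n_dims)` yields a projection matrix, and `X @ A` gives the projected vectors. Replacing the local means with global class means and the overall mean would recover the usual parametric LDA scatter matrices, which is the contrast the abstract draws.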


Bibliographic reference. Sadjadi, Seyed Omid / Pelecanos, Jason / Zhu, Weizhong (2014): "Nearest neighbor discriminant analysis for robust speaker recognition", in INTERSPEECH-2014, 1860-1864.