11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Significance of Pitch Synchronous Analysis for Speaker Recognition Using AANN Models

Sri Harish Reddy Mallidi, Kishore Prahallad, Suryakanth V. Gangashetty, Bayya Yegnanarayana

IIIT Hyderabad, India

For speaker recognition studies, it is necessary to process the speech signal suitably to capture the speaker-specific information. There is complementary speaker-specific information in the excitation source and the vocal tract system characteristics. It is therefore necessary to separate these components, even approximately, from the speech signal. We propose the linear prediction (LP) residual and the LP coefficients to represent these two components. Analysis is performed in a pitch synchronous manner in order to focus on the significant portion of the speech signal in each glottal cycle, and also to reduce the artifacts of digital signal processing on the extracted features. Finally, the speaker-specific information is captured from the excitation and the vocal tract system components using autoassociative neural network (AANN) models. We show that pitch synchronous extraction of information from the residual and the vocal tract system brings out the speaker-specific information much better than pitch asynchronous analysis, as in traditional block processing with an analysis window of fixed size.
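The separation described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes the autocorrelation method of LP analysis solved with the Levinson-Durbin recursion, and it assumes the pitch marks (epochs) are already available from some other step. The function names, the fixed frame length taken from each epoch, and the LP order are hypothetical choices made only for illustration.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation of a frame for lags 0..max_lag."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations.

    Returns predictor coefficients a[1..order] such that
    s[n] is approximated by sum_k a[k] * s[n-k], plus the final error.
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or degenerate frame: stop early
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err


def lp_features(frame, order):
    """Return (LP coefficients, LP residual) for one frame.

    The coefficients stand in for the vocal tract system and the
    residual for the excitation source (samples before the frame
    start are treated as zero).
    """
    r = autocorrelation(frame, order)
    coeffs, _ = levinson_durbin(r, order)
    residual = []
    for n in range(len(frame)):
        pred = sum(coeffs[k] * frame[n - 1 - k]
                   for k in range(order) if n - 1 - k >= 0)
        residual.append(frame[n] - pred)
    return coeffs, residual


def pitch_synchronous_frames(signal, epochs, length):
    """Cut one frame of `length` samples starting at each epoch.

    `epochs` (sample indices of pitch marks) are assumed to be given.
    """
    return [signal[e:e + length] for e in epochs if e + length <= len(signal)]
```

In block processing the frames would instead start at fixed offsets regardless of the glottal cycle; here each frame is anchored to a pitch mark, so the features from `lp_features` describe the same portion of every cycle.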


Bibliographic reference.  Reddy Mallidi, Sri Harish / Prahallad, Kishore / Gangashetty, Suryakanth V. / Yegnanarayana, Bayya (2010): "Significance of pitch synchronous analysis for speaker recognition using AANN models", In INTERSPEECH-2010, 669-672.