In this paper, we propose a spectral-kurtosis-based approach to extracting features with a variable frame length and rate for speaker verification. Because the speaker-specific information carried by the features in each frame varies with the characteristics of the speech, it is important to choose a frame length and rate that capture the salient feature frames. To represent the distinct characteristics of vowels and consonants in both the time and frequency domains, we introduce a variable frame length and rate (VFLR) method based on spectral kurtosis, which provides a local measure of time-frequency concentration. Experimental results show that the proposed VFLR method improves the performance of a speaker verification system on the NIST SRE-06 database by 9.725% (relative) compared with feature extraction using a fixed frame length and rate.
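As a rough illustration of how a kurtosis measure can guide frame-length selection, the sketch below scores candidate frame lengths by the kurtosis of each frame's magnitude spectrum and keeps the highest-scoring one. This is not the paper's algorithm: the abstract does not give the exact spectral kurtosis definition or the selection rule, so the across-frequency kurtosis, the Hann window, the "pick the maximum" criterion, and the names `spectral_kurtosis` and `pick_frame_length` are all assumptions made for illustration.

```python
import numpy as np


def spectral_kurtosis(frame):
    """Kurtosis of the magnitude spectrum across frequency bins.

    Assumed definition for illustration only; higher values mean the
    spectral energy is concentrated in fewer bins.
    """
    windowed = frame * np.hanning(len(frame))
    mag = np.abs(np.fft.rfft(windowed))
    centered = mag - mag.mean()
    var = np.mean(centered ** 2)
    if var == 0.0:
        # Flat (e.g. all-zero) spectrum: no concentration to measure.
        return 0.0
    return float(np.mean(centered ** 4) / var ** 2)


def pick_frame_length(signal, start, candidate_lengths):
    """Return (length, kurtosis) for the candidate frame starting at
    `start` that has the highest spectral kurtosis.

    Hypothetical selection rule; candidates that run past the end of
    the signal are skipped.
    """
    best_len, best_k = None, -np.inf
    for n in candidate_lengths:
        frame = signal[start:start + n]
        if len(frame) < n:
            continue
        k = spectral_kurtosis(frame)
        if k > best_k:
            best_len, best_k = n, k
    return best_len, best_k


# Usage: a 1 kHz tone sampled at 8 kHz, with 10, 20 and 30 ms candidates.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)
length, score = pick_frame_length(tone, start=0, candidate_lengths=[80, 160, 240])
```

Under this assumed definition, a strongly harmonic frame (vowel-like) yields a peaked spectrum and a high kurtosis, while a noise-like frame (as in many consonants) yields a flatter spectrum and a lower value, which is the kind of contrast a variable frame length and rate scheme could exploit.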
Bibliographic reference. Jung, Chi-Sang / Han, Kyu J. / Seo, Hyunson / Narayanan, Shrikanth S. / Kang, Hong-Goo (2010): "A variable frame length and rate algorithm based on the spectral kurtosis measure for speaker verification", In INTERSPEECH-2010, 2754-2757.