It is known that speaker-specific information is distributed non-uniformly in the frequency domain. Current speaker recognition systems use auditory-motivated scales for extracting acoustic features. These scales, however, are not optimised to exploit the spectral distribution of speaker-specific information and hence may not be the best choice for speaker recognition. In this paper, we study the distribution of speaker-specific information in the Spectral Centroid Frequency (SCF) feature and propose a non-uniform filterbank that captures this information effectively. We use the F-ratio and the Kullback-Leibler (KL) distance to measure the distribution of speaker-specific information, and we show empirically that the KL distance is a better measure of discriminative ability than the F-ratio. The proposed filterbank emphasises regions of high KL distance by allocating more filters to those regions. Experimental results show a relative reduction in equal error rate (EER) of 8.8% over the Mel-scale filterbank on the NIST 2006 SRE database.
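The idea of scoring frequency bands and then placing more filters where the score is high can be illustrated with a minimal Python sketch. It is not the procedure used in the paper: the function names `f_ratio` and `allocate_band_edges` are hypothetical, the F-ratio is computed in its common form (variance of speaker means divided by mean within-speaker variance), and the allocation rule (equal cumulative score per filter) is an assumption chosen for illustration. A per-band KL distance could be passed as `score` in the same way.

```python
import numpy as np


def f_ratio(features, speaker_ids):
    """Per-band F-ratio: between-speaker variance over within-speaker variance.

    features:    array of shape (n_frames, n_bands)
    speaker_ids: array of shape (n_frames,) with one speaker label per frame
    """
    speakers = np.unique(speaker_ids)
    # Mean and variance of each band, computed separately for every speaker.
    means = np.stack([features[speaker_ids == s].mean(axis=0) for s in speakers])
    variances = np.stack([features[speaker_ids == s].var(axis=0) for s in speakers])
    return means.var(axis=0) / variances.mean(axis=0)


def allocate_band_edges(score, freqs, n_filters):
    """Place n_filters + 1 band edges so each filter spans equal cumulative score.

    Assumption for illustration only: score is strictly positive and sampled
    at the increasing frequencies in freqs. High-score regions receive
    narrower, and therefore more, filters.
    """
    cdf = np.cumsum(score, dtype=float)
    cdf = (cdf - cdf[0]) / (cdf[-1] - cdf[0])  # normalise to [0, 1]
    targets = np.linspace(0.0, 1.0, n_filters + 1)
    return np.interp(targets, cdf, freqs)


# Toy usage: the score is higher in the upper part of the band.
freqs = np.array([0.0, 1000.0, 2000.0, 3000.0, 4000.0])
score = np.array([1.0, 1.0, 1.0, 5.0, 5.0])
edges = allocate_band_edges(score, freqs, n_filters=4)
```

In this toy case the resulting filters are wide at low frequencies and narrow above 2 kHz, which is the qualitative behaviour the abstract describes for regions of high discriminative score.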
Index Terms: speaker recognition, F-ratio, Kullback-Leibler distance, spectral centroid frequency
Bibliographic reference. Kua, Jia Min Karen / Thiruvaran, Tharmarajah / Ambikairajah, Eliathamby (2012): "A non-uniform filterbank for speaker recognition", In INTERSPEECH-2012, 2274-2277.