The objective of this work is to study the speaker-specific nature of the analytic phase of speech signals. Since computation of the analytic phase suffers from the phase-wrapping problem, we use its derivative, the instantaneous frequency, for feature extraction. Cepstral coefficients extracted from smoothed subband instantaneous frequencies (IFCC) are used as features for speaker verification. The performance of the IFCC features is evaluated on the NIST-2003 speaker recognition evaluation database and compared with that of baseline mel-frequency cepstral coefficients (MFCC). The IFCC features are observed to perform comparably to the MFCC features in terms of equal error rate and minimum detection cost function values. Different strategies for evaluating the speaker verification performance of IFCC and MFCC are explored, and evaluation based on cosine similarity is found to deliver better performance than the other strategies considered.
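To illustrate why the instantaneous frequency avoids phase wrapping, the following is a minimal sketch and not the authors' implementation. It builds the analytic signal with an FFT-based Hilbert transform and takes the angle of the product of each sample with the conjugate of the previous one, so the phase is never explicitly unwrapped. The function names `analytic_signal` and `instantaneous_frequency`, the 8 kHz sampling rate, and the 1 kHz test tone are assumptions chosen for illustration; the subband decomposition, smoothing, and cepstral steps mentioned in the abstract are not shown.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the FFT: zero the negative-frequency bins."""
    n = len(x)
    spectrum = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 1.0                      # keep DC as is
    if n % 2 == 0:
        gain[n // 2] = 1.0             # keep the Nyquist bin as is
        gain[1:n // 2] = 2.0           # double positive frequencies
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * gain)

def instantaneous_frequency(x, fs):
    """Instantaneous frequency in Hz, without explicit phase unwrapping."""
    z = analytic_signal(x)
    # The angle of z[n] * conj(z[n-1]) is the phase increment between
    # successive samples, which lies in (-pi, pi].
    phase_step = np.angle(z[1:] * np.conj(z[:-1]))
    return phase_step * fs / (2.0 * np.pi)

# Hypothetical test signal: a 1 kHz tone sampled at 8 kHz (100 full cycles).
fs = 8000
n_samples = 800
t = np.arange(n_samples) / fs
x = np.cos(2.0 * np.pi * 1000.0 * t)
f_inst = instantaneous_frequency(x, fs)
```

For a pure tone, `f_inst` stays at the tone frequency for every sample pair, whereas the analytic phase itself would wrap around many times over the same interval.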
Bibliographic reference. Vijayan, Karthika / Kumar, Vinay / Murty, K. Sri Rama (2014): "Feature extraction from analytic phase of speech signals for speaker verification", In INTERSPEECH-2014, 1658-1662.