As listeners, we use similarities with people we already know as a means of enhancing speaker verification accuracy. Motivated by this, we use cosine distance similarities with a set of reference speakers, termed cosine distance features (CDF), to improve the performance of speaker verification systems under clean and additive-noise test conditions. We derived CDF from mel frequency cepstral coefficients, power normalized cepstral coefficients, or delta spectral cepstral coefficients, and input them to a support vector machine (SVM) backend classifier (CDF-SVM). The performance of CDF-SVM was then compared with that of an i-vector system with cosine distance scoring (i-CDS) and an i-vector system with a backend SVM classifier (i-SVM) for stationary and non-stationary noises at different signal-to-noise ratio (SNR) levels. The experimental results show that CDF-SVM outperforms all other systems at high SNR and in clean environments. However, in certain low-SNR cases, i-CDS was found to be better. Finally, we fused CDF-SVM with i-CDS, and the results show that the noise robustness of the combined system is significantly better than that of the individual systems at both high and low SNR levels.
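The idea of a cosine distance feature vector can be illustrated with a minimal sketch. It assumes each utterance and each reference speaker is already summarized as a fixed-length vector; the abstract does not say how those vectors are built from the cepstral coefficients, so that step is left out. The function name `cosine_distance_features` and the random toy data are hypothetical and are not taken from the paper.

```python
import numpy as np


def cosine_distance_features(utterance_vec, reference_vecs):
    """Return one cosine similarity per reference speaker.

    utterance_vec:  shape (d,)   assumed fixed-length utterance representation
    reference_vecs: shape (n, d) one assumed representation per reference speaker
    """
    u = np.asarray(utterance_vec, dtype=float)
    refs = np.asarray(reference_vecs, dtype=float)
    # Product of norms, floored to avoid division by zero for all-zero vectors.
    denom = np.maximum(np.linalg.norm(refs, axis=1) * np.linalg.norm(u), 1e-12)
    return (refs @ u) / denom


# Toy data (illustrative only): 5 reference speakers, 8-dimensional vectors.
rng = np.random.default_rng(0)
reference_vecs = rng.normal(size=(5, 8))
# An utterance that is a scaled copy of reference speaker 2.
utterance_vec = 3.0 * reference_vecs[2]

cdf = cosine_distance_features(utterance_vec, reference_vecs)
```

The resulting vector `cdf` has one entry per reference speaker and could then be passed to a backend classifier such as an SVM. Because cosine similarity ignores vector length, the scaled copy of reference speaker 2 scores highest against that speaker.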
Bibliographic reference. George, Kuruvachan K. / Kumar, C. Santhosh / Ramachandran, K. I. / Panda, Ashish (2015): "Cosine distance features for robust speaker verification", In INTERSPEECH-2015, 234-238.