Energy Separation-Based Instantaneous Frequency Estimation for Cochlear Cepstral Feature for Replay Spoof Detection

Ankur T. Patil, Rajul Acharya, Pulikonda Aditya Sai, Hemant A. Patil


Among the various spoofing attacks, replay attacks pose a significant threat to Automatic Speaker Verification (ASV) systems, because they are easy to mount with low-cost, high-quality recording and playback devices. This paper presents a novel feature set, Cochlear Filter Cepstral Coefficient Instantaneous Frequency using Energy Separation Algorithm (CFCCIF-ESA), to develop a countermeasure against replay spoofing attacks. Experimental results on the ASVspoof 2017 Version 2.0 database reveal that the proposed CFCCIF-ESA performs better than the earlier proposed CFCCIF feature set, which uses analytic signal generation via the Hilbert transform. This is because ESA uses an extremely short window to estimate instantaneous frequency and can therefore adapt during speech transitions across phonemes. Experiments are performed using a Gaussian Mixture Model (GMM) as the classifier. The baseline Constant Q Cepstral Coefficient (CQCC) feature set performs slightly better than CFCCIF-ESA on the development set (12.47% and 12.98% Equal Error Rate (EER) for CQCC and CFCCIF-ESA, respectively). However, the contrasting results on the evaluation set (18.81% and 14.77% EER for CQCC and CFCCIF-ESA, respectively) indicate that the proposed CFCCIF-ESA gives relatively better performance for the unseen attacks in the evaluation data. Also, the proposed feature set gives an EER of 11.56% and 13.26% on the development and evaluation datasets, respectively, when fused with the state-of-the-art Mel Frequency Cepstral Coefficient (MFCC) feature set.
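To illustrate why an energy separation approach needs only a very short window, the following is a minimal sketch of instantaneous frequency estimation with the Teager energy operator. The abstract does not say which ESA variant the authors use, so the choice of DESA-2 here is an assumption, and the function names `teager_energy` and `desa2_instantaneous_frequency` are hypothetical. In a CFCCIF-style pipeline the estimator would presumably be applied to each narrowband cochlear filter output; that filterbank is not shown.

```python
import numpy as np


def teager_energy(x):
    """Discrete Teager energy operator: Psi[x(n)] = x(n)^2 - x(n-1) * x(n+1).

    Returns an array two samples shorter than x (the edge samples are dropped).
    """
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]


def desa2_instantaneous_frequency(x, fs):
    """Estimate instantaneous frequency in Hz with DESA-2 (assumed ESA variant).

    Each estimate depends on only five consecutive samples of x, which is the
    "extremely short window" property. The estimate is valid for frequencies
    below fs / 4. The output is four samples shorter than x.
    """
    x = np.asarray(x, dtype=float)

    # Symmetric difference y(n) = x(n+1) - x(n-1), defined for n = 1 .. N-2.
    y = x[2:] - x[:-2]

    # Align both energy tracks on n = 2 .. N-3.
    psi_x = teager_energy(x)[1:-1]
    psi_y = teager_energy(y)

    # Guard against division by zero in silent regions (illustrative choice).
    eps = np.finfo(float).eps
    psi_x_safe = np.where(np.abs(psi_x) > eps, psi_x, eps)

    # DESA-2: Omega(n) = 0.5 * arccos(1 - Psi[y(n)] / (2 * Psi[x(n)])).
    arg = np.clip(1.0 - psi_y / (2.0 * psi_x_safe), -1.0, 1.0)
    omega = 0.5 * np.arccos(arg)  # radians per sample

    return omega * fs / (2.0 * np.pi)
```

For a pure tone such as `np.cos(2 * np.pi * 1000 * np.arange(400) / 16000 + 0.3)` sampled at 16 kHz, the estimate should sit at about 1000 Hz for every output sample. A Hilbert-transform estimator, by contrast, derives the analytic signal from a much longer stretch of the signal, which is the contrast the abstract draws.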


DOI: 10.21437/Interspeech.2019-2742

Cite as: Patil, A.T., Acharya, R., Sai, P.A., Patil, H.A. (2019) Energy Separation-Based Instantaneous Frequency Estimation for Cochlear Cepstral Feature for Replay Spoof Detection. Proc. Interspeech 2019, 2898-2902, DOI: 10.21437/Interspeech.2019-2742.


@inproceedings{Patil2019,
  author={Ankur T. Patil and Rajul Acharya and Pulikonda Aditya Sai and Hemant A. Patil},
  title={{Energy Separation-Based Instantaneous Frequency Estimation for Cochlear Cepstral Feature for Replay Spoof Detection}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={2898--2902},
  doi={10.21437/Interspeech.2019-2742},
  url={http://dx.doi.org/10.21437/Interspeech.2019-2742}
}