ISCA Archive Interspeech 2009

Mel, linear, and antimel frequency cepstral coefficients in broad phonetic regions for telephone speaker recognition

Howard Lei, Eduardo Lopez

We’ve examined the speaker discriminative power of mel-, antimel- and linear-frequency cepstral coefficients (MFCCs, a-MFCCs and LFCCs) in the nasal, vowel, and non-nasal consonant speech regions. Our inspiration came from the work of Lu and Dang in 2007, who showed that filterbank energies at some frequencies, mainly outside the telephone bandwidth, possess more speaker discriminative power due to physiological characteristics of speakers, and who derived a set of cepstral coefficients that outperformed MFCCs in non-telephone speech. Using telephone speech, we’ve discovered that LFCCs gave 21.5% and 15.0% relative equal error rate (EER) improvements over MFCCs in the nasal and non-nasal consonant regions, respectively, agreeing with our filterbank energy f-ratio analysis. We’ve also found that using only the vowel region with MFCCs gives a 9.1% relative improvement over using all speech. Finally, we’ve shown that a-MFCCs are valuable in combination, contributing to a system with 17.3% relative improvement over our baseline.
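
The f-ratio analysis mentioned above can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so this assumes the common definition: the variance of per-speaker mean energies divided by the average within-speaker variance, computed for one filterbank band. The function name `f_ratio` and the toy energy values are hypothetical and not taken from the paper.

```python
from statistics import fmean


def f_ratio(energies_by_speaker):
    """F-ratio of one filterbank band (assumed definition, not from the paper).

    energies_by_speaker maps a speaker label to a list of energies
    observed for that speaker in a single filterbank band.
    """
    # Mean energy of each speaker in this band.
    speaker_means = {
        spk: fmean(vals) for spk, vals in energies_by_speaker.items()
    }
    # Mean of the speaker means.
    global_mean = fmean(list(speaker_means.values()))

    # Between-speaker variance: spread of the speaker means.
    between = fmean([(m - global_mean) ** 2 for m in speaker_means.values()])

    # Within-speaker variance: average spread around each speaker's own mean.
    within = fmean([
        fmean([(x - speaker_means[spk]) ** 2 for x in vals])
        for spk, vals in energies_by_speaker.items()
    ])
    return between / within


# Toy data (hypothetical): band A separates the two speakers, band B does not.
band_a = {"spk1": [1.0, 1.2, 0.8], "spk2": [3.0, 3.2, 2.8]}
band_b = {"spk1": [1.0, 3.0, 2.0], "spk2": [2.2, 1.2, 3.2]}

ratio_a = f_ratio(band_a)
ratio_b = f_ratio(band_b)
print(ratio_a > ratio_b)
```

Under this definition, a band with a higher f-ratio varies more across speakers than within a speaker, which is the sense in which a frequency region would carry more speaker discriminative power.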


doi: 10.21437/Interspeech.2009-389

Cite as: Lei, H., Lopez, E. (2009) Mel, linear, and antimel frequency cepstral coefficients in broad phonetic regions for telephone speaker recognition. Proc. Interspeech 2009, 2323-2326, doi: 10.21437/Interspeech.2009-389

@inproceedings{lei09c_interspeech,
  author={Howard Lei and Eduardo Lopez},
  title={{Mel, linear, and antimel frequency cepstral coefficients in broad phonetic regions for telephone speaker recognition}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={2323--2326},
  doi={10.21437/Interspeech.2009-389}
}