ISCA Archive Odyssey 2001

Formant and F0 features for speaker recognition

Eric G. Hansen, Raymond E. Slyh, Timothy R. Anderson

In this paper, fundamental frequency, formant center frequencies, and formant bandwidths were used as a feature set in speaker verification experiments on the database distributed by the Speaker Odyssey Workshop. The features were extracted using the Entropic Signal Processing System. The main classifier was a Gaussian Mixture Model system built by MIT Lincoln Laboratory, but tests were also run using a Vector Quantization classifier for comparison. Different normalization methods, including Hnorm and spectral subtraction, were applied to try to improve results. Test results on the Speaker Odyssey database, and also on the database used in the NIST 1998 Speaker Recognition Evaluation, are presented as Detection Error Tradeoff (DET) curves. Speaker verification accuracy did not improve using these frequency-based features, but the Equal Error Rate obtained with the small set of frequency-based features was within 10% of that obtained with the standard large set of mel-frequency cepstral coefficients.
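
The Equal Error Rate mentioned in the abstract is the operating point on a DET curve where the miss rate equals the false-alarm rate. The following is a minimal sketch of how it could be estimated from verification scores, not the scoring procedure used in the paper. The function name `equal_error_rate`, the score lists, and the convention that a trial is accepted when its score is at or above the threshold are all assumptions made for illustration.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Estimate the EER by sweeping a threshold over all observed scores.

    Assumed convention: a trial is accepted when score >= threshold.
    Returns (eer, threshold) at the point where the miss rate and the
    false-alarm rate are closest to each other.
    """
    targets = list(target_scores)
    impostors = list(impostor_scores)
    if not targets or not impostors:
        raise ValueError("need at least one target and one impostor score")

    best = None  # (gap, eer_estimate, threshold)
    for threshold in sorted(set(targets + impostors)):
        # Miss: a true-speaker trial rejected by the threshold.
        p_miss = sum(s < threshold for s in targets) / len(targets)
        # False alarm: an impostor trial accepted by the threshold.
        p_fa = sum(s >= threshold for s in impostors) / len(impostors)
        gap = abs(p_miss - p_fa)
        if best is None or gap < best[0]:
            # Average the two rates in case they do not cross exactly.
            best = (gap, (p_miss + p_fa) / 2.0, threshold)
    return best[1], best[2]


# Hypothetical scores, for illustration only (not data from the paper).
eer, threshold = equal_error_rate([2.0, 1.5, 0.2, 1.8], [0.1, 0.3, -0.5, 1.6])
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With finite trial lists the two error rates rarely cross exactly, so the sketch reports the average of the two rates at the closest point; interpolating between adjacent thresholds is a common alternative.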

Cite as: Hansen, E.G., Slyh, R.E., Anderson, T.R. (2001) Formant and F0 features for speaker recognition. Proc. The Speaker and Language Recognition Workshop (Odyssey 2001), 25-30

  author={Eric G. Hansen and Raymond E. Slyh and Timothy R. Anderson},
  title={{Formant and F0 features for speaker recognition}},
  booktitle={Proc. The Speaker and Language Recognition Workshop (Odyssey 2001)},