2001: A Speaker Odyssey - The Speaker Recognition Workshop

June 18-22, 2001
Crete, Greece

Formant and F0 Features for Speaker Recognition

Eric G. Hansen (1), Raymond E. Slyh (2), Timothy R. Anderson (2)

(1) Veridian, Dayton, OH, USA
(2) Air Force Research Laboratory, Wright-Patterson AFB, OH, USA

In this paper, a feature set consisting of fundamental frequency (F0), formant center frequencies, and formant bandwidths was used in speaker verification experiments on the database distributed by the Speaker Odyssey Workshop. The features were extracted using the Entropic Signal Processing System. The main classifier was a Gaussian Mixture Model system built by MIT Lincoln Laboratory, but tests were also run using a Vector Quantization classifier for comparison. Several normalization methods, including Hnorm and spectral subtraction, were applied in an attempt to improve results. Test results on the Speaker Odyssey database, and on the database used in the NIST 1998 Speaker Recognition Evaluation, are presented as Detection Error Trade-off (DET) curves. Speaker verification accuracy did not improve with these frequency-based features, but the Equal Error Rate obtained with the small set of frequency-based features was within 10% of that obtained with the standard, larger set of mel-frequency cepstral coefficients.
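
For readers unfamiliar with the Equal Error Rate mentioned above, the following is a minimal sketch of how it can be estimated from verification scores. It is not the scoring software used in the paper. The function name equal_error_rate and the convention that a trial is accepted when its score is at or above the threshold are assumptions made for illustration only.

```python
from typing import Sequence, Tuple


def equal_error_rate(
    target_scores: Sequence[float],
    impostor_scores: Sequence[float],
) -> Tuple[float, float]:
    """Estimate the EER and the threshold at which it occurs.

    Assumption (illustrative): a trial is accepted when score >= threshold.
    Every observed score is tried as a candidate threshold, and the one
    where the miss rate and false-alarm rate are closest is returned.
    """
    if not target_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")

    candidates = sorted(set(target_scores) | set(impostor_scores))
    best_gap = None
    best_eer = 0.0
    best_threshold = candidates[0]

    for threshold in candidates:
        # Miss: a true-speaker trial scored below the threshold.
        miss = sum(s < threshold for s in target_scores) / len(target_scores)
        # False alarm: an impostor trial scored at or above the threshold.
        false_alarm = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        gap = abs(miss - false_alarm)
        if best_gap is None or gap < best_gap:
            best_gap = gap
            # With finitely many trials the two rates may not cross exactly,
            # so their average is reported at the closest point.
            best_eer = (miss + false_alarm) / 2
            best_threshold = threshold

    return best_eer, best_threshold
```

Sweeping the same threshold over all candidates and plotting miss rate against false-alarm rate on normal-deviate axes gives a DET curve; the EER is the point where that curve meets the diagonal.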


Bibliographic reference.  Hansen, Eric G. / Slyh, Raymond E. / Anderson, Timothy R. (2001): "Formant and F0 features for speaker recognition", In ODYSSEY-2001, 25-30.