Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Comparison of MFCC and Pitch Synchronous AM, FM Parameters for Speaker Identification

Hassan Ezzaidi, Jean Rouat

ERMETIS, DSA, Université du Québec à Chicoutimi, Canada

We study robust pitch synchronous parameters derived from the envelope and instantaneous frequencies estimated via a bank of cochlear filters. Closed-set Speaker Identification experiments are performed on the SPIDRE corpus under matched and mismatched handset conditions. The recognizer is based on a hybrid Linear Vector Quantization and Single Layer Perceptron (LVQSLP). Experiments are reported with different codebook sizes. In the mismatched condition, the Mel Frequency Cepstral Coefficients (MFCC) yield a slightly better identification rate (68%) than the Envelope (58%) and Instantaneous Frequency (65%) parameters when each is used independently. When the MFCC based recognizer is used in conjunction with the envelope based recognizer, the recognition rate increases to 80%. We also report identification rates based on two classes: women and men. In another experiment, listeners were asked to discriminate between speakers in a subset of ten females, and we discuss their performance. Finally, we discuss the potential of the approach, and of a judicious combination of the parameters, to improve Speaker Identification systems.
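
As a rough illustration of the envelope (AM) and instantaneous frequency (FM) quantities mentioned above, the following minimal sketch computes both for a single band-limited signal from its analytic signal. It is an assumption-laden stand-in, not the authors' method: the paper uses a bank of cochlear filters and pitch synchronous analysis, neither of which is reproduced here, and the function names and the test tone are hypothetical.

```python
import numpy as np


def analytic_signal(x):
    """Return the analytic signal of a real 1-D array using the FFT.

    Negative frequencies are zeroed and positive ones doubled, which is
    the standard discrete construction (an assumption of this sketch,
    not something stated in the abstract).
    """
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0  # keep the DC term once
    if n % 2 == 0:
        weights[n // 2] = 1.0  # keep the Nyquist term once
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)


def envelope_and_inst_freq(x, fs):
    """Envelope (AM) and instantaneous frequency in Hz (FM) of one channel."""
    z = analytic_signal(x)
    envelope = np.abs(z)
    phase = np.unwrap(np.angle(z))
    # Phase difference between consecutive samples, converted to Hz.
    inst_freq = np.diff(phase) * fs / (2.0 * np.pi)
    return envelope, inst_freq


# Hypothetical test tone: 100 Hz cosine, amplitude 0.5, one second at 8 kHz.
fs = 8000
t = np.arange(fs) / fs
x = 0.5 * np.cos(2 * np.pi * 100 * t)
env, f_inst = envelope_and_inst_freq(x, fs)
```

In a filterbank setting, a function like `envelope_and_inst_freq` would be applied to each channel output separately; how those per-channel values are then turned into pitch synchronous feature vectors is described in the full paper, not here.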



Bibliographic reference.  Ezzaidi, Hassan / Rouat, Jean (2000): "Comparison of MFCC and pitch synchronous AM, FM parameters for speaker identification", In ICSLP-2000, vol.2, 318-321.