Sixth International Conference on Spoken Language Processing
October 16-20, 2000
Comparison of MFCC and Pitch Synchronous AM, FM Parameters for Speaker Identification
Hassan Ezzaidi, Jean Rouat
ERMETIS, DSA, Université du Québec à Chicoutimi, Canada
We study robust pitch-synchronous parameters derived
from the envelopes and instantaneous frequencies estimated via a bank
of cochlear filters. Closed-set speaker identification experiments
are performed on the SPIDRE corpus under matched and mismatched
handset conditions. The recognizer is based on a hybrid of
Learning Vector Quantization and a Single Layer Perceptron (LVQSLP).
Experiments are reported with different codebook sizes. In the
mismatched condition, the Mel Frequency Cepstral Coefficients
(MFCC) yield a higher identification rate (68%) than the Envelope (58%)
and Instantaneous Frequency (65%) parameters when each is used independently.
When the MFCC based recognizer is used in conjunction
with the envelope based recognizer, the recognition rate increases
to 80%. We also report identification rates based on two
classes: women and men. In another experiment, listeners were
asked to discriminate between speakers in a subset of ten female
speakers, and we discuss their performance. We also discuss the
potential of the approach and of a judicious combination of the
parameters to improve speaker identification systems.
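
The abstract does not state how the MFCC based and envelope based recognizers are combined. As an illustration only, the sketch below assumes a simple score-level fusion: each recognizer's per-speaker scores are normalized with a softmax and then averaged with a weight. The function names, the weight, and the toy scores are hypothetical and are not taken from the paper.

```python
import math

def softmax(scores):
    """Map raw per-speaker scores to values that sum to 1."""
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {spk: math.exp(s - top) for spk, s in scores.items()}
    total = sum(exps.values())
    return {spk: e / total for spk, e in exps.items()}

def fuse(scores_a, scores_b, weight=0.5):
    """Weighted average of two normalized score sets (assumed fusion rule)."""
    p_a = softmax(scores_a)
    p_b = softmax(scores_b)
    return {spk: weight * p_a[spk] + (1.0 - weight) * p_b[spk] for spk in p_a}

def identify(scores):
    """Closed-set decision: pick the speaker with the highest score."""
    return max(scores, key=scores.get)

# Toy scores for three hypothetical speakers (not data from the paper).
mfcc_scores = {"spk_a": 2.0, "spk_b": 2.2, "spk_c": 0.5}
env_scores = {"spk_a": 3.0, "spk_b": 1.0, "spk_c": 0.8}

fused = fuse(mfcc_scores, env_scores)
decision = identify(fused)
```

In this toy case the MFCC scores alone narrowly favor spk_b, while the envelope scores strongly favor spk_a, so the fused decision follows the more confident recognizer. This is one way in which two parameter sets with different error patterns could complement each other.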
Ezzaidi, Hassan / Rouat, Jean (2000):
"Comparison of MFCC and pitch synchronous AM, FM parameters for speaker identification",
In ICSLP-2000, vol.2, 318-321.