ESCA Workshop on Automatic Speaker Recognition, Identification, and Verification

Martigny, Switzerland
April 7-9, 1994

A Comparison of Some Relevant Parametric Representations for Speaker Verification

M. Mehdi Homayounpour (1,2), Gérard Chollet (3,4)

(1) CNRS/URA 1027, Paris, France
(2) Electrical Engineering Dept. of Amirkabir University of Technology (Tehran Polytechnics), Tehran, Iran
(3) IDIAP Research Center, Martigny, Switzerland
(4) TELECOM-Paris, ENST, Dept. Signal, Paris, France

The selection of the best representation of acoustic data is an important task in the design of any speaker verification system. The usual objective in selecting a representation is to enhance those aspects of the signal that contribute significantly to the representation of speaker-dependent information. In this paper, the effectiveness of some spectral representations for speaker verification is evaluated.

The spectral representations considered in our study are: Linear Frequency Cepstrum Coefficients (LFCC), Mel Frequency Cepstrum Coefficients (MFCC), Linear Predictive Cepstrum Coefficients (LPCC), Differences of Adjacent Line Spectrum Pair Frequencies (DALS), Principal Spectral Components (PSC), Orthogonal Partial Correlation Coefficients (OPCC), and the delta (Δ) coefficients derived directly from LPCC and MFCC. Band-pass liftering was also applied to the LPCC coefficients, and the resulting weighted LPCC features (BPLPCC) were considered as well.
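As a rough illustration of two of the derived feature types named above, the following sketch applies a band-pass lifter to a cepstral vector and computes Δ coefficients over a sequence of frames. The abstract does not state which lifter window or Δ formula was used, so both are assumptions: the lifter is the common raised-sine window w(k) = 1 + (L/2) sin(πk/L), and the Δ coefficients use a regression over ±2 frames. The function names bandpass_lifter and delta are hypothetical.

```python
import math


def bandpass_lifter(cepstrum):
    """Weight coefficients c[1..L] by w(k) = 1 + (L/2) * sin(pi * k / L).

    Assumed window; the abstract does not say which lifter was used.
    """
    L = len(cepstrum)
    return [c * (1.0 + (L / 2.0) * math.sin(math.pi * k / L))
            for k, c in enumerate(cepstrum, start=1)]


def delta(frames, half_width=2):
    """Regression-based delta coefficients over +/- half_width frames.

    frames is a list of equal-length feature vectors, one per time frame.
    Frames beyond the sequence edges are replaced by the nearest edge frame.
    """
    T = len(frames)
    dim = len(frames[0])
    denom = 2.0 * sum(n * n for n in range(1, half_width + 1))
    out = []
    for t in range(T):
        row = []
        for d in range(dim):
            num = 0.0
            for n in range(1, half_width + 1):
                nxt = frames[min(t + n, T - 1)][d]
                prv = frames[max(t - n, 0)][d]
                num += n * (nxt - prv)
            row.append(num / denom)
        out.append(row)
    return out
```

With this window the lowest and highest cepstral indices are de-emphasized relative to the middle ones, which is the usual motivation for calling it a band-pass lifter.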

These features were compared on a database of 11 speakers (6 males and 5 females) who repeated a sequence of words 55 times. Two distance measures were employed in our study: for all cepstral representations and Δ coefficients we used the cepstral and the weighted cepstral distance measures. An efficient dynamic time warping method was used to align reference and test data.
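The alignment and distance computation described above can be sketched as follows. This is a textbook dynamic time warping recursion with a squared (optionally weighted) cepstral distance as the local cost; it is not the "efficient" variant used by the authors, whose details the abstract does not give. The function names, the step pattern, and the normalization by the summed sequence lengths are all assumptions.

```python
def weighted_cepstral_distance(a, b, weights=None):
    """Squared distance between two cepstral vectors.

    With weights=None this is the plain squared Euclidean cepstral distance;
    otherwise each squared difference is scaled by its weight.
    """
    if weights is None:
        weights = [1.0] * len(a)
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b))


def dtw_distance(ref, test, weights=None):
    """Length-normalized DTW cost between two sequences of feature vectors."""
    n, m = len(ref), len(test)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning ref[:i] with test[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = weighted_cepstral_distance(ref[i - 1], test[j - 1], weights)
            cost[i][j] = d + min(cost[i - 1][j],      # ref frame advances
                                 cost[i][j - 1],      # test frame advances
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m] / (n + m)
```

In a verification setting, the returned cost would typically be compared against a threshold chosen for the claimed speaker; the abstract does not describe how that decision was made.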

PSC was found to have the best performance among all spectral representations mentioned above. The principal component analysis used to obtain PSC seems to be very efficient in extracting the speaker-dependent information. Weighting the coefficients by the reciprocal of their variability usually leads to good performance in speaker verification. The ranking of the studied representations differed depending on whether or not the coefficients were weighted. The cepstrum parameters (LFCC, MFCC, and LPCC) succeeded better than DALS in capturing the significant speaker-dependent acoustic information when the weighted distance measure was used. A small increase in the performance of LPCC was obtained when the LPCC were band-pass liftered.
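The weighting by reciprocal variability mentioned above can be illustrated with a short sketch. It takes "variability" to mean the per-coefficient population variance estimated over a set of training frames, which is an assumption; the abstract does not say whether variance, standard deviation, or some other measure was used. The function name inverse_variance_weights and the floor parameter are hypothetical.

```python
def inverse_variance_weights(frames, floor=1e-12):
    """Return one weight per coefficient: 1 / variance over the given frames.

    frames is a list of equal-length feature vectors. Variances are floored
    to avoid division by zero for coefficients that never change.
    """
    n = len(frames)
    dim = len(frames[0])
    weights = []
    for k in range(dim):
        mean = sum(f[k] for f in frames) / n
        var = sum((f[k] - mean) ** 2 for f in frames) / n
        weights.append(1.0 / max(var, floor))
    return weights
```

Weights of this kind scale each squared coefficient difference in a weighted distance so that coefficients which fluctuate strongly contribute less than stable ones.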

Bibliographic reference. Homayounpour, M. Mehdi / Chollet, Gérard (1994): "A comparison of some relevant parametric representations for speaker verification", in ASRIV-1994, 185-188.