ISCA Archive ICSLP 1994

Cepstral channel normalization techniques for HMM-based speaker verification

Aaron E. Rosenberg, Chin-Hui Lee, Frank K. Soong

Mismatched recording and channel conditions between training sessions and verification trials can seriously degrade the performance of speaker verification systems. The effect of linear channel distortion can be compensated by subtracting the cepstrum attributable to the distortion from the cepstrum of the observed signal. Three cepstral normalization techniques were studied to evaluate their effect on the performance of a speaker verification system, using a telephone network database of connected-digit password utterances. The three techniques represent the cepstral distortion as a long-term cepstral average, as a short-term cepstral average, and as a maximum likelihood estimate of the observed cepstrum with respect to HMM parameters. Overall, verification performance improves by 30 to 45% with cepstral normalization relative to a baseline condition. The greater improvements are obtained for longer utterances. No significant differences in performance are found among the three techniques.
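As a rough illustration of the subtraction idea described above, the following sketch removes an estimated channel cepstrum from a sequence of cepstral frames, using either a long-term (whole-utterance) average or a short-term (sliding-window) average as the estimate. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the centered rectangular window, and the `half_window` parameter are hypothetical, and the abstract does not specify how the short-term average is defined. The maximum likelihood technique is not sketched here.

```python
from typing import List

Frame = List[float]  # one cepstral vector per analysis frame


def long_term_mean_subtract(frames: List[Frame]) -> List[Frame]:
    """Subtract the per-coefficient mean taken over the whole utterance."""
    if not frames:
        return []
    n = len(frames)
    dim = len(frames[0])
    # A fixed linear channel adds a constant vector in the cepstral domain,
    # so the utterance mean serves as a simple estimate of that vector.
    mean = [sum(f[k] for f in frames) / n for k in range(dim)]
    return [[f[k] - mean[k] for k in range(dim)] for f in frames]


def short_term_mean_subtract(frames: List[Frame], half_window: int) -> List[Frame]:
    """Subtract a mean taken over a window centered on each frame.

    The centered rectangular window is an assumption made for illustration.
    """
    n = len(frames)
    out: List[Frame] = []
    for t in range(n):
        lo = max(0, t - half_window)
        hi = min(n, t + half_window + 1)
        window = frames[lo:hi]
        dim = len(frames[t])
        mean = [sum(f[k] for f in window) / len(window) for k in range(dim)]
        out.append([frames[t][k] - mean[k] for k in range(dim)])
    return out


# Toy data: zero-mean "clean" frames plus a constant channel offset.
clean = [[1.0, -2.0], [-1.0, 2.0]]
channel = [0.5, 0.25]
observed = [[c + h for c, h in zip(f, channel)] for f in clean]
normalized = long_term_mean_subtract(observed)
```

In this toy case the clean frames already average to zero, so the long-term mean equals the channel offset and subtraction recovers the clean frames. With real speech the mean also absorbs some speaker and phonetic content, which is one reason the estimate depends on utterance length.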


Cite as: Rosenberg, A.E., Lee, C.-H., Soong, F.K. (1994) Cepstral channel normalization techniques for HMM-based speaker verification. Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994), 1835-1838

@inproceedings{rosenberg94_icslp,
  author={Aaron E. Rosenberg and Chin-Hui Lee and Frank K. Soong},
  title={{Cepstral channel normalization techniques for HMM-based speaker verification}},
  year=1994,
  booktitle={Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994)},
  pages={1835--1838}
}