ISCA Archive ASRIV 1994

Detecting an imposter in telephone speech

Johan Schalkwyk, Etienne Barnard, Ronald A. Cole, Jeffrey R. Sachs

This paper presents initial results on imposter detection in telephone speech. The imposter detection problem is defined in terms of a real-world security problem. Perceptual studies are then presented. These studies provide a good estimate of the difficulty of the task at hand: humans classify approximately 85.6% of our benchmark utterances correctly. To design an automatic imposter detector, features that elicit speaker differences are studied. A baseline system based only on 20th-order Linear Predictive Coefficients (LPC) classifies 75.0% of the test set correctly. Extracting features only in vowel and semi-vowel regions, i.e. where the all-pole model of the linear predictor is most accurate, increases the classification performance to 80.0%. Adding further features such as average energy and median pitch results in a correct classification rate of 83.7%, comparable to the perceptual benchmarks. Results are also presented for Mandarin, Japanese and Spanish.
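
The abstract names LPC coefficients and average energy as features but does not say how they were computed. The following is a minimal sketch of one common approach, assuming the autocorrelation method with the Levinson-Durbin recursion; the function names `lpc_coefficients` and `average_energy` are hypothetical, and nothing here should be read as the authors' implementation.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Estimate predictor coefficients a[1..order] for one speech frame.

    Assumption: autocorrelation method with Levinson-Durbin recursion.
    The model is x[n] ~ sum_{k=1..order} a[k] * x[n - k].
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation of the frame at lags 0..order.
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])

    a = np.zeros(order + 1)  # a[0] is unused; a[1..order] hold the predictor
    err = r[0]               # prediction error energy, updated per order
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or degenerate frame: keep remaining coefficients at 0
        # Reflection coefficient for order i.
        acc = r[i] - np.dot(a[1:i], r[i - 1:0:-1])
        k = acc / err
        prev = a.copy()
        a[i] = k
        a[1:i] = prev[1:i] - k * prev[i - 1:0:-1]
        err *= 1.0 - k * k
    return a[1:]


def average_energy(frame):
    """Mean squared amplitude of one frame."""
    frame = np.asarray(frame, dtype=float)
    return float(np.mean(frame ** 2))
```

Under these assumptions, a 20th-order feature vector for a frame would be `lpc_coefficients(frame, 20)`, optionally extended with `average_energy(frame)`. Restricting the frames to vowel and semi-vowel regions, as the abstract describes, would require a separate segmentation step that is not sketched here.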

Cite as: Schalkwyk, J., Barnard, E., Cole, R.A., Sachs, J.R. (1994) Detecting an imposter in telephone speech. Proc. ESCA Workshop on Automatic Speaker Recognition, Identification and Verification, 119-122

  author={Johan Schalkwyk and Etienne Barnard and Ronald A. Cole and Jeffrey R. Sachs},
  title={{Detecting an imposter in telephone speech}},
  booktitle={Proc. ESCA Workshop on Automatic Speaker Recognition, Identification and Verification},