Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

On the Influence of Rate, Pitch, and Spectrum on Automatic Speaker Recognition Performance

Thomas F. Quatieri, R. Bob Dunn, Douglas A. Reynolds

M.I.T. Lincoln Laboratory, Lexington, MA, USA

In this paper we study the influence of speech articulation rate, pitch, and spectrum on a GMM-based automatic speaker recognition system [2]. Using the high-quality sinusoidal transformation system [1], these factors are varied in a controlled manner and the effect on recognition performance is evaluated. In general, modified speech caused a larger loss in performance for female than for male speakers, due to greater feature dependence on spectral fine structure with increasing pitch. An important observation in this study is that certain transformations can dramatically alter the aural speaker identifiability of test data with little change in automatic recognition performance, particularly for male speakers. In addition, characterizing the influence of these rate, pitch, and spectral factors on recognition performance is important for understanding the vulnerabilities of speaker recognition systems to speech modified to gain false acceptance. For this purpose, we also investigate performance behavior when impostor (or target) speech alone is modified.

References

  1. T.F. Quatieri and R.J. McAulay, "Shape-Invariant Time-Scale and Pitch Modification of Speech", IEEE Trans. Acoustics, Speech, and Signal Processing, vol.40, no.3, pp.497-510, March 1992.
  2. D.A. Reynolds, T.F. Quatieri, and R.B. Dunn, "Speaker verification using adapted Gaussian mixture models", Digital Signal Processing, Special Issue: NIST 1999 Speaker Recognition Workshop, Academic Press, vol.10, no.1-3, pp.19-41, January/April/July 2000.


Bibliographic reference.  Quatieri, Thomas F. / Dunn, R. Bob / Reynolds, Douglas A. (2000): "On the influence of rate, pitch, and spectrum on automatic speaker recognition performance", In ICSLP-2000, vol.2, 491-494.