First European Conference on Speech Communication and Technology

Paris, France
September 27-29, 1989

Instantaneous and Transitional Perceptually-Based Features in Speaker Identification

L. Xu, John S. Mason

Department of Electrical and Electronic Engineering, University College, Swansea, UK

A recent comparison of features and distance measures [3] shows that perceptually based linear prediction (PLP), combined with an appropriate distance measure, is consistently better than other widely used standard combinations. This paper investigates PLP-derived cepstra, which represent instantaneous spectral information, and the time slope of the cepstra, which represents transitional spectral information, in automatic speaker identification (ASI). The root-power-sum (RPS) distance and the inverse variance (INV) weighted distance are discussed. The experiments use vector quantization (VQ) based, digit-independent ASI. The study shows that the first 8 coefficients of the PLP features are the most important in distinguishing inter-speaker differences. The RPS distance and the INV weighted distance perform similarly well, and significantly better than the unweighted cepstral distance. The overall advantage of PLP features over LPC features is also demonstrated.
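
The abstract does not give formulas for the distance measures or the time slope. As a rough illustration only, the sketch below uses definitions that are common in the cepstral literature: the RPS distance weights coefficient k by its index k, the INV distance divides each squared difference by that coefficient's variance, and the time slope is a least-squares regression slope over a short window of frames. All function names, the 1-based index weighting, and the window half-width of 2 are assumptions made for this example, not details taken from the paper.

```python
import math
from typing import List, Sequence


def cepstral_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Unweighted Euclidean distance between two cepstral vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rps_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Root-power-sum distance (assumed form): coefficient k, counted
    from 1, has its difference multiplied by k before squaring."""
    return math.sqrt(
        sum((k * (x - y)) ** 2 for k, (x, y) in enumerate(zip(a, b), start=1))
    )


def inv_distance(
    a: Sequence[float], b: Sequence[float], variances: Sequence[float]
) -> float:
    """Inverse-variance weighted distance (assumed form): each squared
    difference is divided by the variance of that coefficient."""
    return math.sqrt(sum((x - y) ** 2 / v for x, y, v in zip(a, b, variances)))


def time_slope(
    frames: Sequence[Sequence[float]], t: int, half_width: int = 2
) -> List[float]:
    """Least-squares slope of each coefficient over the frames
    t - half_width .. t + half_width. The caller must ensure that
    this whole window lies inside `frames`."""
    offsets = range(-half_width, half_width + 1)
    denom = sum(k * k for k in offsets)
    n_coeffs = len(frames[t])
    return [
        sum(k * frames[t + k][i] for k in offsets) / denom
        for i in range(n_coeffs)
    ]
```

In this form the RPS weighting emphasises higher-order coefficients, while the INV weighting needs per-coefficient variances estimated beforehand from training data; how the paper estimates those variances is not stated in the abstract.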

Bibliographic reference.  Xu, L. / Mason, John S. (1989): "Instantaneous and transitional perceptually-based features in speaker identification", In EUROSPEECH-1989, 1271-1274.