Sixth European Conference on Speech Communication and Technology
This paper describes a new framework for designing speaker recognition systems based on the discriminative feature extraction (DFE) method. We apply a mel-cepstral estimation technique to the feature extractor in a Gaussian mixture model (GMM)-based text-independent speaker identification system. The mel-cepstral estimation technique uses the second-order all-pass warping function for frequency transformation. We jointly optimize the frequency warping parameters of the feature extractor and the GMM parameters of the classifier based on a minimum classification error (MCE) criterion. Experimental results show that the frequency-warped scale after optimization is different from traditional linear/mel scales; moreover, the proposed system outperforms conventional systems trained with the generalized probabilistic descent (GPD) method, in which only the classifier is optimized.
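For readers unfamiliar with the MCE criterion mentioned above, the following is a minimal sketch of a smoothed MCE loss for a single utterance. It uses the commonly cited formulation (a soft maximum over competing classes followed by a sigmoid) and is not taken from the paper itself. The function name `mce_loss`, the smoothing parameters `eta` and `gamma`, and the use of per-speaker average GMM log-likelihoods as discriminant scores are all assumptions made for illustration. The frequency warping and the gradient updates are not shown.

```python
import numpy as np


def mce_loss(scores, label, eta=1.0, gamma=1.0):
    """Smoothed MCE loss for one utterance (illustrative sketch).

    scores : per-speaker discriminant values, assumed here to be
             average GMM log-likelihoods of the utterance
    label  : index of the true speaker
    eta    : hypothetical smoothing factor for the soft maximum
    gamma  : hypothetical slope of the sigmoid
    """
    scores = np.asarray(scores, dtype=float)
    g_true = scores[label]
    rivals = np.delete(scores, label)

    # Soft maximum over competing speakers:
    #   (1/eta) * log(mean(exp(eta * g_j)))
    # computed with the maximum factored out for numerical stability.
    m = rivals.max()
    g_rival = m + np.log(np.mean(np.exp(eta * (rivals - m)))) / eta

    # Misclassification measure: negative when the true speaker wins.
    d = g_rival - g_true

    # The sigmoid maps d to a differentiable approximation of the 0/1 error.
    return 1.0 / (1.0 + np.exp(-gamma * d))
```

Under these assumptions the loss approaches 0 when the true speaker scores well above all rivals and approaches 1 when a rival dominates. Because the loss is differentiable, its gradient can in principle be propagated to both classifier and feature-extractor parameters, which is the idea behind the joint optimization described in the abstract.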
Bibliographic reference. Miyajima, Chiyomi / Watanabe, Hideyuki / Kitamura, Tadashi / Katagiri, Shigeru (1999): "Speaker recognition based on discriminative feature extraction - optimization of mel-cepstral features using second-order all-pass warping function", In EUROSPEECH'99, 779-782.