Fourth European Conference on Speech Communication and Technology

Madrid, Spain
September 18-21, 1995

Speaker Identification Using Vector Quantisation with Codeword-Specific Derivative Coding

Michael Wagner, John S. Mason, J. Bruce Millar

Trust Project, Research School of Information Sciences and Engineering, Australian National University, Canberra, Australia

This paper investigates improvements to the vector quantisation (VQ) distortion method of text-independent speaker identification. The method uses a conventional codebook of instantaneous cepstral vectors from each speaker's training data, plus one second-level codebook of transitional cepstral vectors for each codeword of the instantaneous codebook. Results on a 20-speaker database of 30 phonetically rich utterances show a reduction of the error rate from 6.5% for a conventional codebook of size 128 to 5.5% for a codebook which contains 16 transitional codewords for each of the 128 instantaneous codewords (128x16). Results on a 20-speaker database of spoken digits show a reduction of the error rate from 3.1% for a conventional (128x0)-codebook to 0.9% for a (128x4)-codebook. Alternatively, a constant error rate can be maintained at a reduced number of codeword comparisons by using codeword-specific transitional codebooks. Results also show that, given a sufficiently large transitional codebook, transitional distortion scores after instantaneous preclassification can be superior to purely instantaneous distortion scores.
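The two-level scoring described in the abstract might be sketched as follows. This is a minimal illustration and not the authors' implementation: the squared Euclidean distortion measure, the additive combination of the two scores with a `weight` parameter, and all function and variable names are assumptions made here for clarity. The sketch also assumes that the instantaneous and transitional cepstral vectors have already been computed for each frame.

```python
from typing import Dict, Sequence, Tuple

Vector = Sequence[float]
Codebook = Sequence[Vector]
# Hypothetical speaker model: one instantaneous codebook plus one
# transitional codebook per instantaneous codeword.
Model = Tuple[Codebook, Sequence[Codebook]]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance (assumed distortion measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vec: Vector, codebook: Codebook) -> Tuple[int, float]:
    """Return (index, distortion) of the codeword closest to vec."""
    dists = [sq_dist(vec, c) for c in codebook]
    best = min(range(len(dists)), key=dists.__getitem__)
    return best, dists[best]


def two_level_distortion(inst_frames: Sequence[Vector],
                         trans_frames: Sequence[Vector],
                         model: Model,
                         weight: float = 1.0) -> float:
    """Average per-frame distortion of an utterance against one speaker model.

    Each frame is first preclassified with the instantaneous codebook; its
    transitional vector is then quantised only with the transitional codebook
    attached to the winning instantaneous codeword.
    """
    if not inst_frames or len(inst_frames) != len(trans_frames):
        raise ValueError("need equally many instantaneous and transitional frames")
    inst_codebook, trans_codebooks = model
    total = 0.0
    for inst, trans in zip(inst_frames, trans_frames):
        k, d_inst = nearest(inst, inst_codebook)
        _, d_trans = nearest(trans, trans_codebooks[k])
        # Assumed combination rule: weighted sum of the two distortions.
        total += d_inst + weight * d_trans
    return total / len(inst_frames)


def identify(inst_frames: Sequence[Vector],
             trans_frames: Sequence[Vector],
             models: Dict[str, Model],
             weight: float = 1.0) -> str:
    """Return the speaker whose model gives the lowest average distortion."""
    scores = {name: two_level_distortion(inst_frames, trans_frames, m, weight)
              for name, m in models.items()}
    return min(scores, key=scores.get)
```

Under these assumptions, a (128x16) model costs 128 + 16 codeword comparisons per frame, because only one of the 128 transitional codebooks is searched. Setting `weight` to 0 reduces the score to the conventional instantaneous VQ distortion.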

Bibliographic reference. Wagner, Michael / Mason, John S. / Millar, J. Bruce (1995): "Speaker identification using vector quantisation with codeword-specific derivative coding", in EUROSPEECH-1995, 383-386.