ESCA Workshop on Automatic Speaker Recognition, Identification, and Verification
We address two problems in text-dependent speaker recognition and verification when only very short utterances (less than 1 second) are available for both training and recognition/verification: building speaker acoustic models and setting verification decision thresholds.
Our approach to speaker modeling exploits speaker-specific acoustic correlations between two sets of parameter vectors from the same speaker. A nonlinear vector interpolation technique captures this speaker-specific information through least-square-error optimization. To determine an optimal threshold for speaker verification, we studied the minimum-risk and minimum-error criteria based on the Bayes decision rule.
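The interpolation idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the form of the nonlinearity, so the sketch assumes a second-order expansion (bias, linear, and squared terms), and it uses synthetic vectors in place of real acoustic parameters. The function names `expand`, `fit_interpolator`, `interpolation_error`, and `synthetic_speaker` are hypothetical. Each speaker model is a weight matrix fitted by least squares to predict one set of vectors from the other; a test utterance is attributed to the speaker whose model gives the smallest interpolation error.

```python
import numpy as np

def expand(x):
    """Assumed nonlinear expansion: bias, linear, and squared terms."""
    x = np.atleast_2d(x)
    return np.hstack([np.ones((x.shape[0], 1)), x, x ** 2])

def fit_interpolator(x, y):
    """Least-squares weights W such that expand(x) @ W approximates y."""
    w, *_ = np.linalg.lstsq(expand(x), y, rcond=None)
    return w

def interpolation_error(w, x, y):
    """Mean squared error when predicting y from x with weights w."""
    return float(np.mean((expand(x) @ w - y) ** 2))

# Synthetic stand-in data (an assumption, not the paper's corpus):
# each speaker has its own hidden relation between the two vector sets.
rng = np.random.default_rng(0)

def synthetic_speaker(n_train=40, n_test=10):
    a = rng.normal(size=(3, 2))   # speaker-specific linear part
    b = rng.normal(size=(3, 2))   # speaker-specific quadratic part
    n = n_train + n_test
    x = rng.normal(size=(n, 3))
    y = x @ a + (x ** 2) @ b + 0.01 * rng.normal(size=(n, 2))
    return (x[:n_train], y[:n_train]), (x[n_train:], y[n_train:])

speakers = [synthetic_speaker() for _ in range(3)]
models = [fit_interpolator(x, y) for (x, y), _ in speakers]

# Recognition: score the held-out data of speaker 0 against every model.
x_test, y_test = speakers[0][1]
errors = [interpolation_error(w, x_test, y_test) for w in models]
recognised = int(np.argmin(errors))
print(recognised, errors)
```

For verification, the same interpolation error can serve as a distance that is compared against a decision threshold.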
Experiments are based on five utterances of four phonemes contained in one sentence. One utterance is used for testing and the remaining four for training. Evaluated on 72 speakers, we obtained a speaker recognition error rate of 3.9% and a minimum-risk speaker verification total error rate of 0.45%.
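The two threshold criteria can be sketched as follows, under stated assumptions. The abstract does not say how the thresholds were estimated, so this sketch picks the threshold empirically from distance scores: a claim is accepted when its distance is at most the threshold, and the risk weights the false-rejection and false-acceptance rates by their costs and priors. The names `empirical_risk` and `best_threshold`, the cost values, and the scores are all hypothetical. The minimum-error criterion is the special case with equal costs.

```python
def empirical_risk(threshold, target, impostor, p_target=0.5, c_fr=1.0, c_fa=1.0):
    """Risk of the rule 'accept if distance <= threshold' on labelled scores."""
    frr = sum(d > threshold for d in target) / len(target)       # false rejections
    far = sum(d <= threshold for d in impostor) / len(impostor)  # false acceptances
    return c_fr * p_target * frr + c_fa * (1.0 - p_target) * far

def best_threshold(target, impostor, **costs):
    """Candidate thresholds are the observed scores; return the lowest-risk one."""
    candidates = sorted(set(target) | set(impostor))
    return min(candidates, key=lambda t: empirical_risk(t, target, impostor, **costs))

# Illustrative distances only (not results from the paper).
target_scores = [0.1, 0.2, 0.25, 0.3, 0.6]    # true-speaker trials
impostor_scores = [0.5, 0.7, 0.9, 1.1]        # impostor trials

min_error_threshold = best_threshold(target_scores, impostor_scores)
min_risk_threshold = best_threshold(target_scores, impostor_scores, c_fr=10.0)
print(min_error_threshold, min_risk_threshold)  # prints 0.3 0.6
```

Raising the cost of a false rejection moves the threshold upward, so more claims are accepted at the price of a higher false-acceptance rate.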
Bibliographic reference. Gong, Yifan / Haton, Jean-Paul (1994): "Non-linear interpolation methods for speaker recognition and verification", In ASRIV-1994, 23-26.