INTERSPEECH 2006 - ICSLP
Mel-Frequency Cepstral Coefficients (MFCC) are widely accepted as a suitable representation for speaker recognition applications. MFCC are usually augmented with dynamic features, which leads to high-dimensional representations. This raises the question of whether some of those features are redundant or dependent on other features, since not all of them are likely to be equally relevant for speaker recognition. In this work, we explore the potential benefit of weighting acoustic features to improve speaker recognition accuracy. Genetic algorithms (GAs) are used to search for an optimal set of weights for a 38-dimensional feature set. Each candidate set of weights is evaluated by measuring recognition error on a validation dataset. Naive speaker models are used, based on empirical distributions of vector quantizer labels. Weighting acoustic features yields relative error reductions of 24.58% and 14.68% in two series of speaker recognition tests. These results provide evidence that further improvements in speaker recognition performance can be attained by weighting acoustic features. They also validate the use of GAs to search for an optimal set of feature weights.
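The search loop described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. Everything in it is an assumption made for illustration: the toy data has 4 dimensions rather than 38, a nearest-centroid classifier stands in for the speaker models built from vector quantizer labels, and the GA operators (truncation selection, one-point crossover, Gaussian mutation, elitism) as well as all function and parameter names are hypothetical, since the abstract does not specify them. Only the overall idea is taken from the text: each candidate weight vector is scored by its recognition error on a validation set.

```python
import random


def make_toy_data(rng, n_per_speaker=40):
    """Toy stand-in for acoustic vectors: 2 informative dims + 2 noisy dims."""
    means = {"spk_a": [0.0, 0.0], "spk_b": [1.0, 1.0]}
    data = []
    for speaker, mean in means.items():
        for _ in range(n_per_speaker):
            informative = [rng.gauss(m, 0.5) for m in mean]
            noise = [rng.gauss(0.0, 3.0) for _ in range(2)]
            data.append((informative + noise, speaker))
    return data


def fit_centroids(train):
    """One mean vector per speaker (assumed stand-in for the speaker models)."""
    sums, counts = {}, {}
    for vec, speaker in train:
        acc = sums.setdefault(speaker, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[speaker] = counts.get(speaker, 0) + 1
    return {s: [v / counts[s] for v in acc] for s, acc in sums.items()}


def weighted_sq_dist(x, y, w):
    """Squared Euclidean distance after scaling each feature by its weight."""
    return sum((wi * (a - b)) ** 2 for a, b, wi in zip(x, y, w))


def error_rate(weights, centroids, validation):
    """Fraction of validation vectors assigned to the wrong speaker."""
    errors = 0
    for vec, speaker in validation:
        guess = min(
            centroids,
            key=lambda s: weighted_sq_dist(vec, centroids[s], weights),
        )
        errors += guess != speaker
    return errors / len(validation)


def evolve_weights(fitness, dim, pop_size=20, generations=30,
                   mutation_sd=0.1, seed=0):
    """Minimise `fitness` over weight vectors in [0, 1]^dim (dim >= 2)."""
    rng = random.Random(seed)
    # Seed the population with uniform weights so that, with elitism,
    # the result is never worse than the unweighted baseline.
    population = [[1.0] * dim] + [
        [rng.random() for _ in range(dim)] for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        ranked = sorted(population, key=fitness)
        parents = ranked[: pop_size // 2]      # truncation selection
        children = [ranked[0]]                 # elitism: keep the best as-is
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, dim)        # one-point crossover
            child = a[:cut] + b[cut:]
            child = [min(1.0, max(0.0, g + rng.gauss(0.0, mutation_sd)))
                     for g in child]           # Gaussian mutation, clipped
            children.append(child)
        population = children
    return min(population, key=fitness)


def relative_error_reduction(baseline_error, weighted_error):
    """Relative error reduction in percent, with respect to the baseline."""
    return 100.0 * (baseline_error - weighted_error) / baseline_error


rng = random.Random(1)
train = make_toy_data(rng)
validation = make_toy_data(rng)
centroids = fit_centroids(train)


def fitness(weights):
    return error_rate(weights, centroids, validation)


best = evolve_weights(fitness, dim=4)
baseline_error = fitness([1.0] * 4)
best_error = fitness(best)
print("baseline error:", baseline_error)
print("weighted error:", best_error)
if baseline_error > 0:
    print("relative reduction (%):",
          relative_error_reduction(baseline_error, best_error))
```

Because the weights here are tuned and scored on the same validation set, the printed reduction is optimistic. A held-out test set would be needed for an honest estimate.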
Bibliographic reference. Zamalloa, Maider / Bordel, Germán / Rodríguez, Luis Javier / Penagarikano, Mikel / Uribe, Juan Pedro (2006): "Using genetic algorithms to weight acoustic features for speaker recognition", In INTERSPEECH-2006, paper 1240-Tue1CaP.2.