Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Using Genetic Algorithms to Weight Acoustic Features for Speaker Recognition

Maider Zamalloa (1), Germán Bordel (1), Luis Javier Rodríguez (1), Mikel Penagarikano (1), Juan Pedro Uribe (2)

(1) Universidad del País Vasco, Spain; (2) IKERLAN, Spain

Mel-Frequency Cepstral Coefficients (MFCC) are widely accepted as a suitable representation for speaker recognition applications. MFCC are usually augmented with dynamic features, leading to high-dimensional representations. This raises the question of whether some of those features are redundant or dependent on other features; most likely, not all of them are equally relevant for speaker recognition. In this work, we explore the potential benefit of weighting acoustic features to improve speaker recognition accuracy. Genetic algorithms (GAs) are used to search for an optimal set of weights for a 38-dimensional feature set. To evaluate each set of weights, recognition error is measured over a validation dataset. Naive speaker models are used, based on empirical distributions of vector quantizer labels. Weighting acoustic features yields relative error reductions of 24.58% and 14.68% in two series of speaker recognition tests. These results provide evidence that further improvements in speaker recognition performance can be attained by weighting acoustic features. They also validate the use of GAs to search for an optimal set of feature weights.
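The general idea of searching for feature weights with a GA, using validation error as the fitness, can be illustrated with a small toy sketch. This is not the authors' system: the abstract gives no GA settings, so the population size, operators, and mutation rate below are assumptions, the data are synthetic, and a weighted nearest-centroid classifier stands in for the vector-quantizer-based speaker models. All function names (`make_means`, `sample`, `centroids`, `error_rate`, `run_ga`) are hypothetical.

```python
import random

def make_means(rng, n_speakers, dims, informative):
    # Assumption: only the first `informative` dimensions separate the speakers.
    return [[rng.uniform(-1, 1) if d < informative else 0.0 for d in range(dims)]
            for _ in range(n_speakers)]

def sample(rng, means, n_per, informative):
    # Informative dimensions get low noise, the remaining ones high noise.
    data = []
    for spk, mu in enumerate(means):
        for _ in range(n_per):
            x = [rng.gauss(m, 0.5 if d < informative else 2.0)
                 for d, m in enumerate(mu)]
            data.append((x, spk))
    return data

def centroids(train, n_speakers, dims):
    # One mean vector per speaker: a stand-in for a real speaker model.
    sums = [[0.0] * dims for _ in range(n_speakers)]
    counts = [0] * n_speakers
    for x, spk in train:
        counts[spk] += 1
        for d in range(dims):
            sums[spk][d] += x[d]
    return [[s / counts[k] for s in sums[k]] for k in range(n_speakers)]

def error_rate(weights, cents, valid):
    # Fitness: fraction of validation vectors assigned to the wrong speaker
    # under a weighted squared Euclidean distance.
    errors = 0
    for x, spk in valid:
        dists = [sum((w * (a - b)) ** 2 for w, a, b in zip(weights, x, c))
                 for c in cents]
        if dists.index(min(dists)) != spk:
            errors += 1
    return errors / len(valid)

def run_ga(cents, valid, dims, rng, pop_size=20, generations=15):
    # The all-ones vector (no weighting) is seeded into the population,
    # so with elitism the result is never worse than the unweighted baseline.
    pop = [[1.0] * dims]
    pop += [[rng.random() for _ in range(dims)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        scored = sorted(((error_rate(w, cents, valid), w) for w in pop),
                        key=lambda t: t[0])
        next_pop = [list(w) for _, w in scored[:2]]  # elitism: keep the two best
        while len(next_pop) < pop_size:
            # Tournament selection of two parents (tournament size 3).
            p1 = min(rng.sample(scored, 3), key=lambda t: t[0])[1]
            p2 = min(rng.sample(scored, 3), key=lambda t: t[0])[1]
            # Uniform crossover followed by Gaussian mutation, clipped to [0, 1].
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            child = [min(1.0, max(0.0, g + rng.gauss(0, 0.2)))
                     if rng.random() < 0.1 else g for g in child]
            next_pop.append(child)
        pop = next_pop
    return min(((error_rate(w, cents, valid), w) for w in pop),
               key=lambda t: t[0])

rng = random.Random(0)
DIMS, INFORMATIVE, SPEAKERS = 8, 2, 4
means = make_means(rng, SPEAKERS, DIMS, INFORMATIVE)
train = sample(rng, means, 40, INFORMATIVE)
valid = sample(rng, means, 30, INFORMATIVE)
cents = centroids(train, SPEAKERS, DIMS)

base_err = error_rate([1.0] * DIMS, cents, valid)
best_err, best_w = run_ga(cents, valid, DIMS, rng)
print("baseline error:", base_err)
print("weighted error:", best_err)
print("weights:", [round(w, 2) for w in best_w])
```

Because the weights are tuned on the validation set, an honest estimate of the benefit would require a separate test set; this sketch omits that step for brevity.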

Bibliographic reference.  Zamalloa, Maider / Bordel, Germán / Rodríguez, Luis Javier / Penagarikano, Mikel / Uribe, Juan Pedro (2006): "Using genetic algorithms to weight acoustic features for speaker recognition", In INTERSPEECH-2006, paper 1240-Tue1CaP.2.