Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Gammatone Auditory Filterbank and Independent Component Analysis for Speaker Identification

Yushi Zhang, Waleed H. Abdulla

University of Auckland, New Zealand

Feature extraction is the key step in robust speaker identification. The most commonly used feature extraction techniques work well only in clean or matched environments. Several factors make accurate speaker identification difficult, the most prominent being handset/channel mismatch and environmental noise. This paper presents a novel technique based on the Gammatone filterbank (GTF) and independent component analysis (ICA). The method first uses the Gammatone filterbank to emulate the frequency resolution of the human cochlea. It then applies ICA to extract the dominant components from these frequency bands. The extracted features emphasize the differences in statistical structure among speakers, so they can model the distribution of each individual speaker. Compared with commonly used techniques such as linear predictive cepstral coefficients (LPCC), Mel-frequency cepstral coefficients (MFCC) and perceptual linear prediction (PLP), the proposed method is more robust to additive noise and yields a higher recognition rate in mismatched environments in a text-independent speaker identification system.
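
The two-stage front end described in the abstract (Gammatone filterbank, then ICA) can be illustrated with the minimal sketch below. It is not the authors' implementation: the abstract gives no parameters, so the sampling rate, channel count, frequency range, frame sizes, number of components, the use of log band energies as ICA input, and the tanh-based symmetric FastICA variant are all assumptions chosen for illustration. The function names are hypothetical, and white noise stands in for a speech signal.

```python
import numpy as np

def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore formula)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def gammatone_bank(fs, n_channels=16, fmin=100.0, fmax=3800.0, order=4, dur=0.032):
    """Return (center_freqs, impulse_responses) for an ERB-spaced gammatone bank.
    All default values are illustrative assumptions."""
    erb_rate = lambda f: 21.4 * np.log10(4.37 * f / 1000.0 + 1.0)
    inv_erb_rate = lambda e: (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37
    # Space center frequencies uniformly on the ERB-rate scale.
    fcs = inv_erb_rate(np.linspace(erb_rate(fmin), erb_rate(fmax), n_channels))
    t = np.arange(int(dur * fs)) / fs
    b = 1.019 * erb(fcs)[:, None]
    # Gammatone impulse response: t^(n-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t)
    ir = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fcs[:, None] * t)
    ir /= np.linalg.norm(ir, axis=1, keepdims=True)
    return fcs, ir

def log_band_energies(x, ir, frame_len, hop):
    """Filter x with each channel; return log frame energies, shape (channels, frames)."""
    y = np.stack([np.convolve(x, h, mode="same") for h in ir])
    n_frames = 1 + (y.shape[1] - frame_len) // hop
    feats = np.empty((ir.shape[0], n_frames))
    for k in range(n_frames):
        seg = y[:, k * hop : k * hop + frame_len]
        feats[:, k] = np.log(np.mean(seg ** 2, axis=1) + 1e-12)
    return feats

def fast_ica(X, n_components, n_iter=200, seed=0):
    """Symmetric FastICA with a tanh nonlinearity. X has shape (dims, samples).
    Returns (components, unmixing_matrix)."""
    X = X - X.mean(axis=1, keepdims=True)
    vals, vecs = np.linalg.eigh(X @ X.T / X.shape[1])
    idx = np.argsort(vals)[::-1][:n_components]       # keep the dominant directions
    K = (vecs[:, idx] / np.sqrt(vals[idx])).T         # whitening matrix
    Z = K @ X
    rng = np.random.default_rng(seed)
    W = np.linalg.qr(rng.standard_normal((n_components, n_components)))[0]
    for _ in range(n_iter):
        G = np.tanh(W @ Z)
        W_new = G @ Z.T / Z.shape[1] - np.diag((1 - G ** 2).mean(axis=1)) @ W
        U, _, Vt = np.linalg.svd(W_new)               # symmetric decorrelation
        W = U @ Vt
    return W @ Z, W @ K

fs = 8000                                             # assumed telephone-band rate
x = np.random.default_rng(0).standard_normal(fs)      # 1 s of noise as a stand-in signal
fcs, ir = gammatone_bank(fs)
feats = log_band_energies(x, ir, frame_len=200, hop=80)
S, unmix = fast_ica(feats, n_components=8)
print(feats.shape, S.shape)
```

In a speaker identification setting, the rows of `S` (one vector per frame) would play the role of the feature vectors passed to a speaker model; how the paper builds and scores those models is not stated in the abstract.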

Bibliographic reference. Zhang, Yushi / Abdulla, Waleed H. (2006): "Gammatone auditory filterbank and independent component analysis for speaker identification", in INTERSPEECH-2006, paper 1354-Wed3CaP.6.