International Symposium on Chinese Spoken Language Processing (ISCSLP 2002)

Taipei, Taiwan
August 23-24, 2002

Compact Speech Features Based on Wavelet Transform and PCA with Application to Speaker Identification

Ching-Tang Hsieh, Eugene Lai, Wan-Chen Chen, You-Chuang Wan

Tamkang University, Taipei, Taiwan

The main goal of this paper is to find effective methods for improving the performance of speaker identification systems. We use the wavelet transform to decompose the speech signal into several frequency bands and then use cepstral coefficients to capture the individual characteristics of the vocal tract within the bands of interest, based on the acoustic characteristics of the human ear. In addition, an adaptive wavelet-based filtering mechanism is applied to eliminate the small variations in the wavelet coefficients caused by noise. To make effective use of all these multi-band speech features, we propose a modified vector quantization method, called multi-layer eigen-codebook vector quantization (MLECVQ), as the identifier. This model uses a multi-layer concept to eliminate interference between the multi-band coefficients and then uses principal component analysis (PCA) to evaluate the codebooks so that they capture more detail of the phoneme characteristics. Experimental results show that the proposed method outperforms the GMM+MFCC model in both computational cost and recognition performance on clean and noisy speech data.
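
The front-end steps named in the abstract (band decomposition, suppression of small wavelet coefficients, per-band cepstral features, and PCA over each band's features) can be illustrated with a minimal sketch. It assumes a Haar wavelet, a fixed soft threshold, and a real cepstrum, none of which the abstract specifies; all function names and parameter values are hypothetical, and the sketch is not the authors' MLECVQ implementation.

```python
import numpy as np

def haar_dwt(x):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    x = np.asarray(x, dtype=float)
    if len(x) % 2:                      # pad odd lengths by repeating the last sample
        x = np.append(x, x[-1])
    pairs = x.reshape(-1, 2)
    approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)
    detail = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0)
    return approx, detail

def wavelet_bands(x, levels=3):
    """Split a frame into `levels` detail bands plus the final approximation band."""
    bands = []
    approx = np.asarray(x, dtype=float)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        bands.append(detail)
    bands.append(approx)
    return bands

def soft_threshold(c, t):
    """Shrink coefficients toward zero; magnitudes below t become exactly zero."""
    return np.sign(c) * np.maximum(np.abs(c) - t, 0.0)

def real_cepstrum(band, n_coeffs):
    """First n_coeffs real-cepstrum coefficients (assumed stand-in for the paper's cepstral features)."""
    spectrum = np.abs(np.fft.rfft(band)) + 1e-12   # small offset avoids log(0)
    return np.fft.irfft(np.log(spectrum), n=len(band))[:n_coeffs]

def frame_features(frame, levels=3, threshold=0.1, n_coeffs=4):
    """One cepstral vector per band, kept separate rather than concatenated."""
    bands = wavelet_bands(frame, levels)
    # Threshold only the detail bands; the approximation band is left untouched.
    cleaned = [soft_threshold(b, threshold) for b in bands[:-1]] + [bands[-1]]
    return [real_cepstrum(b, n_coeffs) for b in cleaned]

def pca_basis(features, k):
    """Top-k principal directions (as rows) of an (n_samples, n_dims) matrix."""
    centered = features - features.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)         # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]
    return eigvecs[:, order].T

# Usage on synthetic data (random noise standing in for speech frames).
rng = np.random.default_rng(0)
frames = rng.standard_normal((20, 256))
per_frame = [frame_features(f) for f in frames]
band0 = np.array([bands[0] for bands in per_frame])  # features of the highest band
basis = pca_basis(band0, k=2)
```

Keeping one feature vector per band, instead of concatenating them, mirrors the abstract's point that the bands are handled in separate layers; how the eigen-codebooks are actually built from the principal components is described only in the full paper.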


Bibliographic reference.  Hsieh, Ching-Tang / Lai, Eugene / Chen, Wan-Chen / Wan, You-Chuang (2002): "Compact speech features based on wavelet transform and PCA with application to speaker identification", In ISCSLP 2002, paper 92.