5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

An Efficient Mel-LPC Analysis Method for Speech Recognition

Hiroshi Matsumoto (1), Yoshihisa Nakatoh (2), Yoshinori Furuhata (1)

(1) Dept. of Electrical & Electronic Eng., Faculty of Eng., Shinshu University, Japan
(2) Multimedia Development Center, Matsushita Electric Industrial Co., Ltd., Japan

This paper proposes a simple and efficient time-domain technique for estimating an all-pole model on a mel-frequency axis (Mel-LPC), i.e., the bilinear-transformed all-pole model of Strube. The autocorrelation coefficients on the mel-frequency axis are derived exactly, without any approximation, by computing cross-correlation coefficients between the speech signal and its all-pass filtered version. The method requires only twice the computational cost of conventional linear prediction analysis. The recognition performance of the mel-cepstral parameters obtained by Mel-LPC analysis is compared with that of conventional LP mel-cepstra and mel-frequency cepstrum coefficients (MFCC) in gender-dependent phoneme and word recognition tests. The results show that the Mel-LPC cepstrum attains a significant improvement in recognition accuracy over the conventional LP mel-cepstrum, and gives slightly higher accuracy than MFCC for male speakers and slightly lower accuracy for female speakers.
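The cross-correlation step described in the abstract can be illustrated with a minimal sketch. It assumes a first-order all-pass section (z^-1 - alpha) / (1 - alpha z^-1) applied repeatedly to a windowed frame; the function names `allpass` and `warped_autocorr` and the parameter `alpha` (warping factor) are hypothetical and chosen here for illustration. Any spectral weighting, windowing, or normalization used in the full method is not reproduced.

```python
import numpy as np

def allpass(x, alpha):
    """First-order all-pass filter (z^-1 - alpha) / (1 - alpha z^-1).

    Hypothetical helper: y[n] = -alpha*x[n] + x[n-1] + alpha*y[n-1].
    """
    y = np.zeros(len(x))
    prev_x = 0.0
    prev_y = 0.0
    for n, xn in enumerate(x):
        y[n] = -alpha * xn + prev_x + alpha * prev_y
        prev_x, prev_y = xn, y[n]
    return y

def warped_autocorr(x, order, alpha):
    """Cross-correlate a frame with its m-times all-pass filtered versions.

    r[m] = sum_n x[n] * y_m[n], where y_0 = x and y_m = allpass(y_{m-1}).
    Assumption: x is a finite (already windowed) frame, so the sum only
    needs the first len(x) samples of each filtered signal.
    """
    x = np.asarray(x, dtype=float)
    r = np.empty(order + 1)
    y = x.copy()
    for m in range(order + 1):
        r[m] = np.dot(x, y)
        y = allpass(y, alpha)  # one more all-pass stage for the next lag
    return r
```

With `alpha = 0` the all-pass section reduces to a unit delay, so `warped_autocorr` returns the ordinary autocorrelation sequence used in conventional LP analysis. The resulting coefficients could then be passed to a standard Levinson-Durbin recursion to obtain the all-pole model.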

