5th International Conference on Spoken Language Processing
This paper proposes a simple and efficient time-domain technique to estimate an all-pole model on a mel-frequency axis (Mel-LPC), i.e., the bilinear-transformed all-pole model of Strube. The autocorrelation coefficients on the mel-frequency axis are derived exactly, without any approximation, by computing cross-correlation coefficients between the speech signal and its all-pass filtered versions. This method requires only twice the computational cost of conventional linear prediction analysis. The recognition performance of mel-cepstral parameters obtained by the Mel-LPC analysis is compared with that of conventional LP mel-cepstra and mel-frequency cepstral coefficients (MFCC) through gender-dependent phoneme and word recognition tests. The results show that the Mel-LPC cepstrum attains a significant improvement in recognition accuracy over the conventional LP mel-cepstrum, and gives slightly higher accuracy than MFCC for male speakers and slightly lower accuracy for female speakers.
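The cross-correlation idea in the abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names `allpass` and `warped_autocorr`, the warping factor `alpha`, and the first-order all-pass recursion are assumptions not given in the abstract, and any windowing, pre-emphasis, or weighting used in the paper is omitted.

```python
def allpass(x, alpha):
    """Apply a first-order all-pass filter (z^-1 - alpha) / (1 - alpha z^-1).

    Assumed recursion: y[n] = alpha * (y[n-1] - x[n]) + x[n-1],
    with zero initial state.
    """
    y = []
    prev_x = 0.0
    prev_y = 0.0
    for xn in x:
        yn = alpha * (prev_y - xn) + prev_x
        y.append(yn)
        prev_x, prev_y = xn, yn
    return y


def warped_autocorr(x, order, alpha):
    """Cross-correlate x with successively all-pass filtered copies of x.

    r[m] = sum_n x[n] * y_m[n], where y_0 = x and y_m is y_(m-1)
    passed through one more all-pass stage. The sum runs only over
    the frame length, because x is zero outside the frame.
    """
    y = list(x)
    r = []
    for _ in range(order + 1):
        r.append(sum(a * b for a, b in zip(x, y)))
        y = allpass(y, alpha)
    return r


# Hypothetical usage: a short toy frame and an illustrative warping factor.
frame = [1.0, 2.0, 3.0, 4.0]
r_warped = warped_autocorr(frame, order=3, alpha=0.35)
```

With `alpha = 0` each all-pass stage reduces to a unit delay, so the sketch returns the ordinary autocorrelation sequence. In this sketch each lag costs one filtering pass plus one correlation pass over the frame, which is in line with the roughly twofold cost stated in the abstract.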
Bibliographic reference. Matsumoto, Hiroshi / Nakatoh, Yoshihisa / Furuhata, Yoshinori (1998): "An efficient mel-LPC analysis method for speech recognition", In ICSLP-1998, paper 0047.