This paper proposes a simple and efficient time-domain technique for estimating an all-pole model on a mel-frequency axis (Mel-LPC), i.e., the bilinear-transformed all-pole model of Strube. The autocorrelation coefficients on the mel-frequency axis are derived exactly, without any approximation, by computing cross-correlation coefficients between the speech signal and its all-pass filtered version. The method requires only twice the computational cost of conventional linear prediction analysis. The recognition performance of the mel-cepstral parameters obtained by Mel-LPC analysis is compared with that of conventional LP mel-cepstra and mel-frequency cepstral coefficients (MFCC) in gender-dependent phoneme and word recognition tests. The results show that the Mel-LPC cepstrum significantly improves recognition accuracy over the conventional LP mel-cepstrum and, compared with MFCC, gives slightly higher accuracy for male speakers and slightly lower accuracy for female speakers.
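The cross-correlation idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a first-order all-pass filter of the form (z^-1 - alpha) / (1 - alpha z^-1) applied repeatedly to a windowed frame, and the names `allpass`, `warped_autocorr`, `order`, and `alpha` are hypothetical, chosen only for illustration.

```python
import numpy as np

def allpass(x, alpha):
    """First-order all-pass filter (z^-1 - alpha) / (1 - alpha z^-1).

    Assumed filter form; output is truncated to the frame length.
    """
    y = np.zeros(len(x))
    prev_x = 0.0
    prev_y = 0.0
    for n, xn in enumerate(x):
        # y[n] = -alpha * x[n] + x[n-1] + alpha * y[n-1]
        y[n] = -alpha * xn + prev_x + alpha * prev_y
        prev_x, prev_y = xn, y[n]
    return y

def warped_autocorr(x, order, alpha):
    """Cross-correlate a frame with its repeatedly all-pass filtered copies.

    r[m] is the inner product of the frame with the frame passed
    m times through the all-pass filter (hypothetical helper).
    """
    x = np.asarray(x, dtype=float)
    r = np.empty(order + 1)
    y = x.copy()
    for m in range(order + 1):
        r[m] = np.dot(x, y)    # lag-m coefficient on the warped axis
        y = allpass(y, alpha)  # one more all-pass stage for the next lag
    return r
```

With `alpha = 0` the all-pass filter reduces to a unit delay, so `warped_autocorr` returns the conventional autocorrelation coefficients of the frame. For nonzero `alpha`, each lag adds one filtering pass over the frame on top of the inner product. The resulting coefficients could then be fed to a standard Levinson-Durbin recursion, under the assumption that they play the role of ordinary autocorrelation coefficients.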
Cite as: Matsumoto, H., Nakatoh, Y., Furuhata, Y. (1998) An efficient mel-LPC analysis method for speech recognition. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0047, doi: 10.21437/ICSLP.1998-536
@inproceedings{matsumoto98_icslp,
  author={Hiroshi Matsumoto and Yoshihisa Nakatoh and Yoshinori Furuhata},
  title={{An efficient mel-LPC analysis method for speech recognition}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 0047},
  doi={10.21437/ICSLP.1998-536}
}