This paper proposes a low bit rate coding method for speech and audio based on a new analysis method, Mel-LPC (MLPC) analysis. In MLPC analysis, the spectral envelope is estimated on a mel- or Bark-frequency scale, which improves spectral resolution in the low-frequency band. The analysis requires about a two-fold increase in computation over standard LPC analysis. Our coding algorithm using MLPC analysis consists of five key parts: time-frequency transformation, inverse filtering by the MLPC spectral envelope, power normalization, perceptual weighting estimation, and multi-stage VQ. In subjective experiments, we evaluated MLPC analysis through paired comparison tests between MLPC analysis and standard LPC analysis in the inverse filtering step. At all bit rates, almost all listeners judged the sound decoded with MLPC analysis to be superior to the sound decoded with LPC analysis. The difference was especially large at low bit rates.
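The effect of estimating an envelope on a warped frequency scale can be illustrated with a short sketch. It assumes the warping is realized by a first-order all-pass (bilinear) transform, a common way to approximate mel or Bark scales; the abstract does not state which warping is used. The function names and the value `alpha = 0.4` are hypothetical and chosen only for illustration.

```python
import math


def warp_frequency(omega, alpha):
    """Map angular frequency omega (rad/sample, 0..pi) to the warped axis.

    Uses the phase response of the first-order all-pass substitution
    z^-1 -> (z^-1 - alpha) / (1 - alpha * z^-1).
    Assumption: this bilinear warping stands in for the mel/Bark scale.
    """
    return omega + 2.0 * math.atan2(alpha * math.sin(omega),
                                    1.0 - alpha * math.cos(omega))


def resolution_gain(omega, alpha):
    """Local stretch factor d(warped)/d(omega) of the warped axis.

    Values above 1 mean the band around omega is expanded, so a fixed-order
    envelope model spends more of its resolution there.
    """
    return (1.0 - alpha ** 2) / (1.0 + alpha ** 2 - 2.0 * alpha * math.cos(omega))


if __name__ == "__main__":
    alpha = 0.4  # illustrative warping factor, not taken from the paper
    for k in range(0, 9):
        omega = k * math.pi / 8
        print(f"omega={omega:.3f}  warped={warp_frequency(omega, alpha):.3f}  "
              f"gain={resolution_gain(omega, alpha):.3f}")
```

With a positive `alpha`, the stretch factor is greater than 1 near zero frequency and less than 1 near the Nyquist frequency, which matches the abstract's claim of improved resolution in the low-frequency band.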
Cite as: Nakatoh, Y., Norimatsu, T., Low, A.H., Matsumoto, H. (1998) Low bit rate coding for speech and audio using mel linear predictive coding (MLPC) analysis. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 1100, doi: 10.21437/ICSLP.1998-391
@inproceedings{nakatoh98_icslp,
  author={Yoshihisa Nakatoh and Takeshi Norimatsu and Ah Heng Low and Hiroshi Matsumoto},
  title={{Low bit rate coding for speech and audio using mel linear predictive coding (MLPC) analysis}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 1100},
  doi={10.21437/ICSLP.1998-391}
}