5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Low Bit Rate Coding for Speech and Audio Using Mel Linear Predictive Coding (MLPC) Analysis

Yoshihisa Nakatoh (1), Takeshi Norimatsu (1), Ah Heng Low (2), Hiroshi Matsumoto (3)

(1) Matsushita Electric Industrial Co., Ltd., Japan
(2) Faculty of Engineering, Shinshu University, Japan
(3) Faculty of Engineering, Shinshu University, Japan

This paper proposes a low bit rate coding method for speech and audio based on a new analysis method, Mel-LPC (MLPC) analysis. MLPC analysis estimates the spectral envelope on a mel or Bark frequency scale, which improves spectral resolution in the low frequency band. It requires only about twice the computation of standard LPC analysis. The coding algorithm consists of five key parts: time-frequency transformation, inverse filtering by the MLPC spectral envelope, power normalization, perceptual weighting estimation, and multi-stage VQ. We evaluated MLPC analysis in subjective experiments, using paired comparison tests between MLPC analysis and standard LPC analysis in the inverse filtering stage. At all bit rates, almost all listeners judged the sound decoded with MLPC analysis to be superior to that decoded with LPC analysis, and the difference was especially large at low bit rates.
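The abstract does not say how the mel or Bark scale is obtained. One common approach, assumed here purely for illustration, is to warp the frequency axis with a first-order all-pass filter whose phase response approximates such a scale. The sketch below shows only that frequency mapping, not the MLPC analysis itself; the function names, the warping constant alpha = 0.4, and the 16 kHz sampling rate are hypothetical choices, not values taken from the paper.

```python
import math


def warp_frequency(omega, alpha):
    """Map a normalized angular frequency omega (radians, 0..pi) to the
    warped axis given by the phase of a first-order all-pass filter.
    alpha is the warping constant (assumed, |alpha| < 1); alpha = 0
    leaves the frequency axis unchanged."""
    return omega + 2.0 * math.atan2(alpha * math.sin(omega),
                                    1.0 - alpha * math.cos(omega))


def warp_hz(f_hz, fs_hz, alpha):
    """Same mapping expressed in Hz for a sampling rate fs_hz."""
    omega = 2.0 * math.pi * f_hz / fs_hz
    return warp_frequency(omega, alpha) * fs_hz / (2.0 * math.pi)


if __name__ == "__main__":
    fs = 16000.0   # illustrative sampling rate (assumption)
    alpha = 0.4    # illustrative warping constant (assumption)
    for f in (250.0, 500.0, 1000.0, 2000.0, 4000.0, 8000.0):
        print(f"{f:7.0f} Hz -> {warp_hz(f, fs, alpha):8.1f} Hz (warped)")
```

With a positive alpha the low frequencies are stretched and the high frequencies compressed, so an all-pole model of fixed order fitted on the warped axis spends more of its resolution on the low band. This is the effect the abstract attributes to MLPC analysis.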

