5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Comparison of Spectral Estimation Techniques for Low Bit-Rate Speech Coding

D. J. Molyneux (1), C. I. Parris (2), X. Q. Sun (3), B. M. G. Cheetham (1)

(1) School of Engineering, University of Manchester, UK
(2) Ensigma Ltd., UK
(3) Voxware Inc., USA

Many low bit-rate speech coders represent the spectral envelope by an all-pole digital filter whose coefficients are calculated by a form of linear prediction (LP) analysis. The lower the bit-rate, the more critical the accuracy of the spectral analysis becomes for achieving good-quality speech. This paper compares four known techniques: a technique based on cubic spline interpolation, discrete all-pole modelling (DAP), minimum variance distortionless response (MVDR) estimation, and iterative all-pole modelling. First, the accuracy of each technique is assessed on artificial and real speech spectra by calculating the spectral distortion relative to the spectral envelope sampled at the pitch harmonics. Then, each technique is used to characterise the spectral amplitudes generated by a 2.4 kb/s multi-band excitation (MBE) coder. Results show that DAP gives significantly better spectral accuracy. However, listening tests on MBE-encoded speech indicate that the advantage of DAP over the other techniques is not strongly perceptible.
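The accuracy measure described above can be illustrated with a minimal sketch. The abstract does not state the exact distortion formula, so the code below assumes a common definition: the root-mean-square difference, in dB, between the all-pole model's magnitude response and the reference amplitudes at the pitch harmonics. The function names, the coefficient convention A(z) = 1 + a1 z^-1 + ... + ap z^-p, and the parameter values are illustrative assumptions, not taken from the paper.

```python
import cmath
import math


def allpole_magnitude_db(a, gain, omega):
    """Magnitude in dB of gain / A(e^{j*omega}).

    Assumed convention: A(z) = 1 + a[0] z^-1 + ... + a[p-1] z^-p.
    """
    denom = 1 + sum(ak * cmath.exp(-1j * omega * (k + 1)) for k, ak in enumerate(a))
    return 20.0 * math.log10(gain / abs(denom))


def harmonic_spectral_distortion(a, gain, harmonic_amps, f0_hz, fs_hz):
    """RMS dB difference between the model and reference amplitudes.

    harmonic_amps[m-1] is the reference (linear) amplitude at the m-th
    pitch harmonic, i.e. at frequency m * f0_hz. This RMS log-spectral
    form is an assumed definition, not necessarily the paper's.
    """
    sq_err = 0.0
    for m, amp in enumerate(harmonic_amps, start=1):
        omega = 2.0 * math.pi * m * f0_hz / fs_hz  # harmonic frequency in rad/sample
        ref_db = 20.0 * math.log10(amp)
        sq_err += (ref_db - allpole_magnitude_db(a, gain, omega)) ** 2
    return math.sqrt(sq_err / len(harmonic_amps))


# Hypothetical example: a one-pole envelope sampled at the harmonics of a
# 100 Hz pitch, with an 8 kHz sampling rate.
FS_HZ, F0_HZ = 8000.0, 100.0
TRUE_A, TRUE_GAIN = [-0.9], 1.0
reference_amps = [
    10.0 ** (allpole_magnitude_db(TRUE_A, TRUE_GAIN, 2.0 * math.pi * m * F0_HZ / FS_HZ) / 20.0)
    for m in range(1, 40)
]

# A model identical to the reference envelope has (numerically) zero distortion.
sd_exact = harmonic_spectral_distortion(TRUE_A, TRUE_GAIN, reference_amps, F0_HZ, FS_HZ)
# Doubling the gain shifts every harmonic by the same amount in dB.
sd_gain_error = harmonic_spectral_distortion(TRUE_A, 2.0 * TRUE_GAIN, reference_amps, F0_HZ, FS_HZ)
```

Because the error is evaluated only at the harmonic frequencies, the measure ignores how the model behaves between harmonics, which is the regime in which the compared techniques differ.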

