5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Source Controlled Variable Bit-Rate Speech Coder Based On Waveform Interpolation

F. Plante (1), B. M. G. Cheetham (1), D. Marston (2), P. A. Barrett (3)

(1) Dept. Electrical Engineering & Electronics, Liverpool University, UK
(2) Ensigma Ltd, UK
(3) BT Laboratories, UK

This paper describes a source controlled variable bit-rate (SC-VBR) speech coder based on the concept of prototype waveform interpolation (PWI). The coder uses a four-mode classification: silence, voiced, unvoiced and transition. These modes are detected after the speech has been decomposed into slowly evolving (SEW) and rapidly evolving (REW) waveforms. Voice activity detection (VAD), the relative level of the SEW and REW, and the cross-correlation coefficient between characteristic waveform segments are used to make the classification. The encoding of the SEW components is improved using gender adaptation. In tests using conversational speech, the SC-VBR coder achieves a compression factor of around 3. The VBR coder was evaluated against a fixed-rate 4.6 kbit/s PWI coder for clean speech and noisy speech, and was found to perform better for male speech and for noisy speech.
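The four-mode decision described above can be illustrated with a minimal sketch. It is not the authors' classifier: the abstract names the three cues (VAD, relative SEW/REW level, cross-correlation between characteristic waveform segments) but not how they are combined, so the decision rule, the function names `normalized_xcorr` and `classify_frame`, and both threshold values are assumptions made purely for illustration.

```python
import math


def normalized_xcorr(a, b):
    """Normalized cross-correlation coefficient of two equal-length segments."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0 else 0.0


def classify_frame(vad_active, sew_energy, rew_energy, xcorr,
                   ratio_threshold=1.0, xcorr_threshold=0.5):
    """Assign one of the four modes to a frame.

    Hypothetical rule (not taken from the paper):
      - no speech activity                      -> silence
      - SEW dominates and segments correlate    -> voiced
      - REW dominates and segments decorrelate  -> unvoiced
      - cues disagree                           -> transition
    The thresholds are placeholder values, not tuned figures.
    """
    if not vad_active:
        return "silence"
    sew_dominant = sew_energy > ratio_threshold * rew_energy
    correlated = xcorr >= xcorr_threshold
    if sew_dominant and correlated:
        return "voiced"
    if not sew_dominant and not correlated:
        return "unvoiced"
    return "transition"


if __name__ == "__main__":
    # Two toy characteristic waveform segments with identical shape.
    previous = [0.0, 1.0, 0.0, -1.0]
    current = [0.0, 0.9, 0.0, -0.9]
    c = normalized_xcorr(previous, current)
    print(classify_frame(True, sew_energy=4.0, rew_energy=1.0, xcorr=c))
```

In a variable-rate coder of this kind, the mode label would then select the bit allocation for the frame; those allocations are not given in the abstract and are left out here.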

