This paper describes a source-controlled variable bit-rate (SC-VBR) speech coder based on prototype waveform interpolation (PWI). The coder uses a four-mode classification: silence, voiced, unvoiced and transition. These modes are detected after the speech has been decomposed into slowly evolving waveforms (SEW) and rapidly evolving waveforms (REW). Voice activity detection (VAD), the relative levels of the SEW and REW, and the cross-correlation coefficient between characteristic waveform segments are used to make the classification. The encoding of the SEW components is improved using gender adaptation. In tests using conversational speech, the SC-VBR coder achieves a compression factor of around 3. The VBR coder was evaluated against a fixed-rate 4.6 kbit/s PWI coder for clean and noisy speech and was found to perform better for male speech and for noisy speech.
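As an illustration only, the four-mode decision described above might be sketched as follows. The abstract gives neither the decision rules nor any thresholds, so the function names, the rule ordering, and the values `ratio_threshold` and `corr_threshold` are all hypothetical assumptions, not the authors' method.

```python
import math


def normalized_cross_correlation(a, b):
    """Cross-correlation coefficient between two equal-length waveform segments."""
    if len(a) != len(b):
        raise ValueError("segments must have the same length")
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0 else 0.0


def classify_frame(vad_active, sew_energy, rew_energy, cw_correlation,
                   ratio_threshold=1.0, corr_threshold=0.7):
    """Return one of 'silence', 'voiced', 'unvoiced', 'transition'.

    Hypothetical rule set (thresholds are placeholders, not from the paper):
      - no voice activity                       -> silence
      - SEW dominates and waveforms correlate   -> voiced
      - REW dominates and waveforms decorrelate -> unvoiced
      - mixed evidence                          -> transition
    """
    if not vad_active:
        return "silence"
    sew_dominant = sew_energy > ratio_threshold * rew_energy
    correlated = cw_correlation >= corr_threshold
    if sew_dominant and correlated:
        return "voiced"
    if not sew_dominant and not correlated:
        return "unvoiced"
    return "transition"
```

Here `cw_correlation` would be obtained by applying `normalized_cross_correlation` to two successive characteristic waveform segments; a source-controlled coder would then select a bit allocation per mode.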
Cite as: Plante, F., Cheetham, B.M.G., Marston, D., Barrett, P.A. (1998) Source controlled variable bit-rate speech coder based on waveform interpolation. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0848, doi: 10.21437/ICSLP.1998-395
@inproceedings{plante98_icslp,
  author={F. Plante and B. M. G. Cheetham and D. Marston and P. A. Barrett},
  title={{Source controlled variable bit-rate speech coder based on waveform interpolation}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 0848},
  doi={10.21437/ICSLP.1998-395}
}