Entropy Coding of Spectral Envelopes for Speech and Audio Coding Using Distribution Quantization

Srikanth Korse, Tobias Jähnel, Tom Bäckström


Speech and audio codecs model the overall shape of the signal spectrum using envelope models. In speech coding, the predominant approach is linear predictive coding, which offers high coding efficiency at the cost of computational complexity and a rigid system design. Audio codecs are usually based on scale factor bands, whose calculation and coding are simple, but whose coding efficiency is lower than that of linear prediction. In the current work we propose an entropy coding approach for scale factor bands, with the objective of reaching the same coding efficiency as linear prediction while retaining a low computational complexity. The proposed method is based on quantizing the distribution of spectral mass using beta-distributions. Our experiments show that the perceptual quality achieved with the proposed method is similar to that of linear predictive models at the same bit rate, while the design simultaneously allows variable bit-rate coding and can easily be scaled to different sampling rates. The algorithmic complexity of the proposed method is less than one third of that of traditional multi-stage vector quantization of linear predictive envelopes.
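
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea of describing spectral mass as a distribution and scoring it with a beta-distribution. It is not the authors' implementation. The function names, the single left/right split, the shape parameters `a` and `b`, and the quantization step are all assumptions made for illustration.

```python
from math import gamma, log2


def split_ratio(band_energies, lo, hi):
    """Fraction of the spectral mass in bands[lo:hi] that lies in the left half.

    Hypothetical helper: the result is a number in [0, 1], which is the
    kind of quantity a beta-distribution can model.
    """
    mid = (lo + hi) // 2
    total = sum(band_energies[lo:hi])
    if total <= 0:
        return 0.5  # no mass at all: treat the split as balanced
    return sum(band_energies[lo:mid]) / total


def beta_pdf(x, a, b):
    """Density of a Beta(a, b) distribution at x, for 0 < x < 1."""
    norm = gamma(a + b) / (gamma(a) * gamma(b))
    return norm * x ** (a - 1) * (1 - x) ** (b - 1)


def ideal_bits(x, a, b, step):
    """Approximate ideal code length (bits) of a quantization cell at x.

    The cell probability is approximated as pdf(x) * step, which is only
    reasonable for small steps; an entropy coder would spend roughly
    -log2(probability) bits on that cell.
    """
    return -log2(beta_pdf(x, a, b) * step)


# Example: four scale factor band energies (made-up values).
energies = [1.0, 1.0, 2.0, 4.0]
ratio = split_ratio(energies, 0, len(energies))  # 2.0 / 8.0 = 0.25
bits = ideal_bits(ratio, a=2.0, b=2.0, step=1.0 / 16.0)
print(ratio, bits)
```

In this sketch, ratios that the assumed beta model considers likely cost fewer bits than unlikely ones, which is the basic mechanism by which entropy coding can lower the average bit rate of an envelope description.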


DOI: 10.21437/Interspeech.2016-55

Cite as

Korse, S., Jähnel, T., Bäckström, T. (2016) Entropy Coding of Spectral Envelopes for Speech and Audio Coding Using Distribution Quantization. Proc. Interspeech 2016, 2543-2547.

Bibtex
@inproceedings{Korse+2016,
author={Srikanth Korse and Tobias Jähnel and Tom Bäckström},
title={Entropy Coding of Spectral Envelopes for Speech and Audio Coding Using Distribution Quantization},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-55},
url={http://dx.doi.org/10.21437/Interspeech.2016-55},
pages={2543--2547}
}