Blind Recovery of Perceptual Models in Distributed Speech and Audio Coding

Tom Bäckström, Florin Ghido, Johannes Fischer


Central to speech and audio codecs are their perceptual models, which describe the relative perceptual importance of errors in different elements of the signal representation. In practice, the perceptual models consist of signal-dependent weighting factors which are used in the quantization of each element. For optimal performance, we would like to use the same perceptual model at the decoder. Because the perceptual model is signal-dependent, however, it is not known in advance at the decoder, so audio codecs generally transmit this model explicitly, at the cost of increased bit consumption. In this work we present an alternative method which recovers the perceptual model at the decoder from the transmitted signal without any side information. The approach will be especially useful in distributed sensor networks and the Internet of Things, where the added bit-consumption cost of transmitting a perceptual model increases with the number of sensors.


DOI: 10.21437/Interspeech.2016-27

Cite as

Bäckström, T., Ghido, F., Fischer, J. (2016) Blind Recovery of Perceptual Models in Distributed Speech and Audio Coding. Proc. Interspeech 2016, 2483-2487.

BibTeX
@inproceedings{Bäckström+2016,
author={Tom Bäckström and Florin Ghido and Johannes Fischer},
title={Blind Recovery of Perceptual Models in Distributed Speech and Audio Coding},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-27},
url={http://dx.doi.org/10.21437/Interspeech.2016-27},
pages={2483--2487}
}