A novel perceptual postfilter is introduced. For each frame, the filter gains, z, are estimated from a vector, y, comprising the quantized line spectral frequencies (LSFs) and the long-term prediction gain of that frame. The proposed perceptual postfilter is derived from the optimal minimum mean-square error (MMSE) estimator, i.e., the estimated gain vector is ẑ = E{z|y}. The MMSE estimator is based on the conditional pdf of z given y, which is computed from their joint pdf, modelled by a Gaussian mixture model (GMM). Compared with the conventional adaptive postfilter, the proposed perceptual postfilter improves speech naturalness while remaining an "add-on" postfilter that requires no modification to the existing encoder.
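To illustrate the estimator described above, the following is a minimal sketch of the conditional mean E{z|y} under a joint GMM over the stacked vector [y; z]. It uses the standard closed form for Gaussian mixtures and is not taken from the paper: the function name `gmm_conditional_mean`, its parameter layout, and the toy numbers are assumptions made for illustration only. Training the GMM and mapping the gains to an actual postfilter are outside its scope.

```python
import numpy as np

def gmm_conditional_mean(y, weights, means, covs, dy):
    """Return E{z|y} for a joint GMM over x = [y; z].

    Hypothetical helper (not from the paper). The first `dy` entries of
    each component mean/covariance belong to y, the remainder to z.
    """
    y = np.asarray(y, dtype=float)
    log_post = []    # unnormalised log posterior of each component given y
    cond_means = []  # per-component conditional mean of z given y
    for pi_k, mu, cov in zip(weights, means, covs):
        mu = np.asarray(mu, dtype=float)
        cov = np.asarray(cov, dtype=float)
        mu_y, mu_z = mu[:dy], mu[dy:]
        cov_yy, cov_zy = cov[:dy, :dy], cov[dy:, :dy]
        diff = y - mu_y
        sol = np.linalg.solve(cov_yy, diff)  # cov_yy^{-1} (y - mu_y)
        _, logdet = np.linalg.slogdet(cov_yy)
        # log( pi_k * N(y; mu_y, cov_yy) )
        log_post.append(np.log(pi_k)
                        - 0.5 * (diff @ sol + logdet + dy * np.log(2 * np.pi)))
        # mu_z + cov_zy cov_yy^{-1} (y - mu_y)
        cond_means.append(mu_z + cov_zy @ sol)
    log_post = np.array(log_post)
    post = np.exp(log_post - log_post.max())  # subtract max for stability
    post /= post.sum()
    # Posterior-weighted sum of the per-component conditional means.
    return post @ np.array(cond_means)

# Toy usage with made-up numbers: one component, 2-dim y, 1-dim z.
weights = [1.0]
means = [[0.0, 0.0, 1.0]]
covs = [[[1.0, 0.0, 0.5],
         [0.0, 1.0, 0.0],
         [0.5, 0.0, 1.0]]]
z_hat = gmm_conditional_mean([2.0, 0.0], weights, means, covs, dy=2)
```

With a single component the estimate reduces to ordinary linear regression of z on y; with several components, each component's regression is weighted by the posterior probability of that component given y.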
Cite as: Chen, W., Kabal, P., Shabestary, T.Z. (2005) Perceptual postfilter estimation for low bit rate speech coders using Gaussian mixture models. Proc. Interspeech 2005, 3161-3164, doi: 10.21437/Interspeech.2005-841
@inproceedings{chen05j_interspeech,
  author={Wei Chen and Peter Kabal and Turaj Z. Shabestary},
  title={{Perceptual postfilter estimation for low bit rate speech coders using Gaussian mixture models}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={3161--3164},
  doi={10.21437/Interspeech.2005-841}
}