7th International Conference on Spoken Language Processing
September 16-20, 2002
The compact representation of the discrete amplitude spectrum of voiced speech by an all-pole model of the spectral envelope is considered. Based on the properties of the all-pole modeling error, the use of spectrum predistortion for improving the perceptual fit at low model orders is motivated. Warping of the frequency scale before modeling of the spectral envelope of narrowband voiced sounds is investigated by subjective listening and objective measures. It is found that, contrary to what is generally accepted, the improvement in perceived quality brought about by frequency warping actually depends to a large extent on the underlying signal spectrum distribution. An objective distance measure based on partial noise loudness is found to show high correlation with subjective judgements of degradation, indicating that auditory frequency masking plays an important role in determining the perceptual accuracy of the spectrum envelope model.
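As an illustration of the kind of processing the abstract describes, the following is a minimal sketch of fitting an all-pole model to a spectral envelope after warping the frequency axis. It is not the authors' procedure. It assumes a first-order all-pass (bilinear) warping map, a continuous envelope supplied as a function rather than a discrete harmonic spectrum, and the standard autocorrelation method with the Levinson-Durbin recursion. All function names and the parameter `lam` are hypothetical.

```python
import math

def warp_frequency(omega, lam):
    """First-order all-pass (bilinear) frequency map; lam = 0 is the identity.
    Assumed here for illustration; the map with -lam is its inverse."""
    return omega + 2.0 * math.atan2(lam * math.sin(omega), 1.0 - lam * math.cos(omega))

def autocorr_from_power(power, order):
    """Autocorrelation lags 0..order from power samples on a uniform grid over [0, pi]."""
    n = len(power) - 1
    lags = []
    for k in range(order + 1):
        # Trapezoidal rule for (1/pi) * integral of P(w) * cos(k*w) over [0, pi].
        total = 0.5 * (power[0] + power[n] * math.cos(k * math.pi))
        for i in range(1, n):
            total += power[i] * math.cos(k * math.pi * i / n)
        lags.append(total / n)
    return lags

def levinson_durbin(r, order):
    """Solve the normal equations; returns ([1, a1, ..., ap], squared gain)."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[j] * r[m - j] for j in range(1, m))
        k = -acc / err  # reflection coefficient for stage m
        prev = a[:]
        for j in range(1, m):
            a[j] = prev[j] + k * prev[m - j]
        a[m] = k
        err *= 1.0 - k * k
    return a, err

def fit_allpole(envelope, order, lam=0.0, n=512):
    """Fit an all-pole model to envelope(omega), sampled uniformly on the warped axis."""
    # Each uniform warped-axis point is mapped back to the original axis with -lam.
    power = [envelope(warp_frequency(math.pi * i / n, -lam)) for i in range(n + 1)]
    return levinson_durbin(autocorr_from_power(power, order), order)

def allpole_power(a, gain, omega):
    """Power response gain / |A(e^{j*omega})|^2 of the fitted model."""
    re = sum(c * math.cos(j * omega) for j, c in enumerate(a))
    im = -sum(c * math.sin(j * omega) for j, c in enumerate(a))
    return gain / (re * re + im * im)

# Sanity check on an envelope that is itself all-pole (single pole at 0.9).
coeffs, gain = fit_allpole(lambda w: 1.0 / (1.81 - 1.8 * math.cos(w)), order=2)
```

With `lam = 0` the fit is made on the original frequency scale. With a positive `lam` the low-frequency region is stretched before fitting, so a low-order model spends more of its resolution there. In that case the fitted response is defined on the warped axis and would have to be mapped back with `warp_frequency` before comparison with the original spectrum.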
Bibliographic reference. Patwardhan, Pushkar / Rao, Preeti (2002): "Controlling perceived degradation in spectrum envelope modeling via predistortion", in ICSLP-2002, 1837-1840.