Many low bit-rate sinusoidal coding techniques, such as sinusoidal transform coding, require a short-term spectral envelope to be estimated for each frame of speech. Suitably encoded, this envelope determines the amplitudes of the sinusoids used to model the speech. In some coders, phase information is also derived from the envelope under the assumption that it represents the gain response of a minimum-phase vocal tract transfer function. The phase spectrum thus derived depends on how the envelope is interpolated between pitch-frequency harmonics, especially in the vicinity of formants. The use of optimisation to improve the shape of the envelope in such regions, and to compensate for the inadequacy of the minimum-phase assumption, has been shown to be capable of reducing phase discrepancies. Frequency spreading due to pitch-frequency variation also distorts the spectral envelope, and can be compensated for. Experiments have shown that the shape of the reconstructed waveform can be made closer to that of the original by spectral envelope modification, and that an improvement in perceived speech quality is obtained.
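
To illustrate the minimum-phase assumption mentioned above, the following is a minimal sketch of one standard way to derive a phase spectrum from a gain envelope, using the real cepstrum. It is not taken from the paper. The function name `minimum_phase_from_envelope`, the assumption that the envelope has already been interpolated onto a full, even-length, symmetric FFT grid, and the small floor applied before the logarithm are all illustrative choices.

```python
import numpy as np

def minimum_phase_from_envelope(magnitude):
    """Return the minimum-phase phase response (radians) for a gain envelope.

    `magnitude` is assumed (for this sketch) to hold the envelope sampled on a
    full FFT grid of even length N, symmetric about N/2.
    """
    magnitude = np.asarray(magnitude, dtype=float)
    n = len(magnitude)
    # Log gain; the floor avoids log(0) and is an arbitrary illustrative value.
    log_mag = np.log(np.maximum(magnitude, 1e-12))
    # Real cepstrum: inverse DFT of the log gain (an even sequence).
    cepstrum = np.fft.ifft(log_mag).real
    # Fold the cepstrum onto positive quefrencies so that it becomes causal,
    # which is the defining property of a minimum-phase system.
    fold = np.zeros(n)
    fold[0] = 1.0
    fold[1:n // 2] = 2.0
    fold[n // 2] = 1.0
    # The DFT of the causal cepstrum is log|H| + j*arg(H).
    log_spectrum = np.fft.fft(cepstrum * fold)
    return log_spectrum.imag

# Example: a filter known to be minimum phase, H(z) = 1 + 0.5 z^-1.
H = np.fft.fft([1.0, 0.5], 512)
phase = minimum_phase_from_envelope(np.abs(H))
print(np.max(np.abs(phase - np.angle(H))))  # small if the recovery works
```

Because the recovered phase is computed entirely from the log gain, any change in how the envelope is interpolated between harmonics changes the phase as well, which is the dependence the abstract describes.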
Bibliographic reference. Cheetham, B. M. G. / Sun, X. Q. / Wong, W. T. K. (1995): "Spectral envelope estimation for low bit-rate sinusoidal speech coders", In EUROSPEECH-1995, 693-696.