European Conference on Speech Technology
Edinburgh, Scotland, UK
High-quality speech coding at medium-to-low bit rates is presently one of the major goals of speech research. Stochastic coding represents an important step towards this objective, yet the quality of the synthetic speech is still not always good enough. A subjectively important part of the distortion may arise from imperfect reproduction of voiced regions, where the harmonic structure is less well marked in the synthetic signal than in the original speech signal. Post-processing the synthetic signal with harmonic modelling is a natural way to reduce this distortion. The disadvantages of this method (additional delay, added complexity, and dependence on high-precision pitch detectors) can be offset by the higher quality of the resynthesized speech in voiced regions.
Bibliographic reference. Trancoso, Isabel M. / Tribolet, Jose M. (1987): "Harmonic post-processing of speech synthesized by stochastic coders", In ECST-1987, 2181-2184.