The ESCA Workshop on Speech Synthesis

September 25-28, 1990
Autrans, France

Energy and Articulation Rules for Improving Diphone Speech Synthesis

Ph. Depalle, Xavier Rodet, G. Poirot

IRCAM - Dept. Analyse/Synthèse, Paris, France

Diphone speech synthesis-by-rule has recently been substantially improved. However, it is based on a rather crude model of the articulatory movements of continuous speech. To improve the quality of diphone synthesis, more sophisticated rules are needed to reduce the gap between synthetic and natural speech. In this article we present some of the procedures that we recently developed to improve our speech synthesis system. Such procedures are easily implemented in our system because we code diphones in terms of peaks of the transfer function (frequency, amplitude and bandwidth) instead of LPC parameters. Characteristics of the source are also easy to modify, since the source is coded in terms of amplitude and harmonic/noise coefficients in various frequency bands. Several procedures have been implemented and successfully tested. They deal with the energy contour, smoothing of diphone boundaries, variations in phoneme duration, rapid changes of filter coefficients and source characteristics.
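
To illustrate why a peak-based coding makes such procedures simple to express, the following is a minimal sketch and not the procedure used in the system described here. It assumes each frame is a list of spectral peaks (frequency, amplitude, bandwidth) and that the junction between two diphones is smoothed by plain linear interpolation of those values. The names Peak, blend_frames and smooth_boundary, the units, and the choice of linear interpolation are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Peak:
    """One peak of the transfer function (field names and units are assumptions)."""
    freq_hz: float  # centre frequency
    amp_db: float   # peak amplitude
    bw_hz: float    # bandwidth


Frame = List[Peak]


def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between a (t = 0) and b (t = 1)."""
    return a + (b - a) * t


def blend_frames(left: Frame, right: Frame, t: float) -> Frame:
    """Interpolate peak by peak between two frames with the same number of peaks."""
    if len(left) != len(right):
        raise ValueError("frames must contain the same number of peaks")
    return [
        Peak(
            lerp(p.freq_hz, q.freq_hz, t),
            lerp(p.amp_db, q.amp_db, t),
            lerp(p.bw_hz, q.bw_hz, t),
        )
        for p, q in zip(left, right)
    ]


def smooth_boundary(left: Frame, right: Frame, n_steps: int) -> List[Frame]:
    """Return n_steps intermediate frames lying strictly between left and right.

    Hypothetical smoothing rule: evenly spaced linear interpolation.
    """
    return [
        blend_frames(left, right, k / (n_steps + 1))
        for k in range(1, n_steps + 1)
    ]


# Last frame of one diphone and first frame of the next (made-up values).
last_frame = [Peak(500.0, 0.0, 80.0)]
first_frame = [Peak(700.0, -6.0, 100.0)]
transition = smooth_boundary(last_frame, first_frame, 3)
```

Because each parameter has a direct acoustic meaning, interpolating it moves the peak itself, which is what makes this kind of rule easy to state. The same operation applied directly to LPC predictor coefficients does not in general give a comparable result.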

