The ESCA Workshop on Speech Synthesis
September 25-28, 1990
This paper describes a synthesis model of formant trajectories at various speaking rates. The model represents each formant trajectory as a sum of temporal functions: a second-order delay function, which represents vowel-to-vowel transitions, and two first-order delay functions, which represent the effects of the surrounding consonants on the vowel formant trajectories. Using this model, VCV speech samples were synthesized at slow and fast speaking rates, and their intelligibility was tested. Compared with speech synthesized by analysis, the formant model slightly improved the intelligibility of vowels at both speaking rates and of consonants at the slow rate. For consonants in fast speech, however, the model decreased intelligibility by 6%. These results suggest that the model works well, although additional strategies are needed to improve the intelligibility of consonants, especially at fast speaking rates.
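The summation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the functional forms or parameter values, so everything below is an assumption made for illustration: the second-order delay function is taken to be a critically damped step response, the two first-order functions are taken to be exponential decays on either side of the consonant, and all names (VCVParams, formant_trajectory, and so on) and numbers are hypothetical rather than the authors' own.

```python
import math
from dataclasses import dataclass


@dataclass
class VCVParams:
    # All fields and values are hypothetical; none come from the paper.
    v1_hz: float      # formant target of the first vowel
    v2_hz: float      # formant target of the second vowel
    t_vv: float       # onset time (s) of the vowel-to-vowel transition
    tau_vv: float     # time constant (s) of the second-order function
    t_c: float        # time (s) at which the consonant is centred
    d_pre_hz: float   # consonant-induced shift on the preceding vowel
    tau_pre: float    # time constant (s) of that shift
    d_post_hz: float  # consonant-induced shift on the following vowel
    tau_post: float   # time constant (s) of that shift


def second_order_step(t, tau):
    """Critically damped second-order step response (assumed form)."""
    if t <= 0.0:
        return 0.0
    return 1.0 - (1.0 + t / tau) * math.exp(-t / tau)


def first_order_decay(dt, tau):
    """First-order exponential decay over a non-negative interval dt (assumed form)."""
    return math.exp(-dt / tau)


def formant_trajectory(t, p):
    """Sum one vowel-to-vowel term and one of two consonant terms at time t."""
    vowel = p.v1_hz + (p.v2_hz - p.v1_hz) * second_order_step(t - p.t_vv, p.tau_vv)
    if t < p.t_c:
        # Effect of the consonant on the preceding vowel, growing toward t_c.
        consonant = p.d_pre_hz * first_order_decay(p.t_c - t, p.tau_pre)
    else:
        # Effect of the consonant on the following vowel, fading after t_c.
        consonant = p.d_post_hz * first_order_decay(t - p.t_c, p.tau_post)
    return vowel + consonant


# Illustrative parameters for one formant of one VCV token.
example = VCVParams(700.0, 300.0, 0.05, 0.03, 0.10, -80.0, 0.02, -120.0, 0.02)
track = [formant_trajectory(i * 0.005, example) for i in range(60)]
```

In a sketch of this kind, a change in speaking rate would presumably be expressed through the timing parameters (onset times and time constants), but how the paper actually maps rate onto the model is not stated in the abstract.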
Bibliographic reference. Imaizumi, Satoshi / Kiritani, Shigeru (1990): "A generation model of formant trajectory at various speaking rates", In SSW1-1990, 1-4.