Second ESCA/IEEE Workshop on Speech Synthesis
September 12-15, 1994
A general method that combines formant synthesis by rule with time-domain concatenation is investigated. The method aims to keep the advantages of both techniques while minimizing difficulties such as prosodic modification and spectral discontinuities at the points of concatenation. We have integrated a sampled natural glottal source and sampled voiceless consonants into a real-time text-to-speech formant synthesizer. In special cases, we have also incorporated voicing amplitude envelopes and formant transitions derived from natural speech. Several listening tests were performed to evaluate these methods, and the initial results are very promising. As was found for Japanese, we obtained a significant overall improvement in intelligibility over our previous formant synthesizer. The results of subjective analysis also show that these methods can improve naturalness and listenability factors.
Bibliographic reference. Pearson, Steve / Moran, Heather / Hata, Kazue / Holm, Frode (1994): "Combining concatenation and formant synthesis for improved intelligibility and naturalness in text-to-speech systems", In SSW2-1994, 69-72.