Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Analysis of the Degradation of French Vowels Induced By the TD-PSOLA Algorithm, in Text-to-Speech Context

Christophe J. Blouin, Paul C. Bagshaw

France Télécom R&D, France

In concatenative speech synthesis systems, synthetic speech is obtained by concatenating acoustic units selected from a database of natural speech. The duration and fundamental frequency (F0) of the selected units are usually different from those requested by a prosodic model, and so some prosodic modification must be applied to the units in order to obtain the desired target. TD-PSOLA is an effective and widely used prosodic modification algorithm, but its use can degrade the perceived quality of the synthetic speech signal. This paper focuses on the evaluation of the degradation of French vowels and determines the influence of several parameters through an analysis of variance. The results show that vowels divide into two groups, based on their first formant frequency (F1). Finally, a modification cost function representative of the degradation is derived from the investigation.

Bibliographic reference. Blouin, Christophe J. / Bagshaw, Paul C. (2000): "Analysis of the degradation of French vowels induced by the TD-PSOLA algorithm, in text-to-speech context", in ICSLP-2000, vol. 1, 709-712.