8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


A Hybrid Method Oriented to Concatenative Text-to-Speech Synthesis

Ignasi Iriondo, Francesc Alias, Javier Sanchis, Javier Melenchon

Ramon Llull University, Spain

In this paper we present a speech synthesis method for diphone-based text-to-speech systems. Its main goal is to achieve prosodic modifications that result in more natural-sounding synthetic speech. This improvement is especially useful for emotional speech synthesis, which requires high-quality prosodic modification. The method is a hybrid of TD-PSOLA and the harmonic plus noise model, and it incorporates a novel procedure for modifying pitch and time-scale jointly. Preliminary results show an improvement in synthetic speech quality when large pitch modifications are required.
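
As a rough illustration of what jointly modifying pitch and time-scale involves in a pitch-synchronous framework, the sketch below shows the generic textbook TD-PSOLA mapping from synthesis pitch marks back to analysis pitch marks. It is not the hybrid method of this paper. The function name `map_pitch_marks` and the parameters `pitch_factor` (F0 multiplier) and `time_factor` (duration multiplier) are assumptions made for the example, and pitch marks are assumed to be given as increasing sample positions.

```python
from bisect import bisect_left


def map_pitch_marks(analysis_marks, pitch_factor, time_factor):
    """Generic TD-PSOLA-style mark mapping (illustrative sketch only).

    analysis_marks: increasing pitch-mark positions, in samples.
    pitch_factor:   F0 multiplier (2.0 halves the pitch period).
    time_factor:    duration multiplier (2.0 doubles the duration).
    Returns (synthesis_marks, source_indices): where each output frame is
    placed, and which analysis frame would be copied there.
    """
    if len(analysis_marks) < 2:
        raise ValueError("need at least two analysis pitch marks")
    if pitch_factor <= 0 or time_factor <= 0:
        raise ValueError("scaling factors must be positive")

    start = analysis_marks[0]
    total = (analysis_marks[-1] - start) * time_factor  # stretched duration

    synthesis_marks, source_indices = [], []
    t_syn = 0.0  # offset from `start` on the synthesis time axis
    while t_syn <= total:
        # Project the synthesis instant back onto the analysis time axis.
        t_ana = start + t_syn / time_factor

        # Pick the nearest analysis mark (ties go to the earlier mark).
        k = bisect_left(analysis_marks, t_ana)
        if k == len(analysis_marks):
            k -= 1
        elif k > 0 and t_ana - analysis_marks[k - 1] <= analysis_marks[k] - t_ana:
            k -= 1

        # Local pitch period around the selected analysis mark.
        if k + 1 < len(analysis_marks):
            period = analysis_marks[k + 1] - analysis_marks[k]
        else:
            period = analysis_marks[k] - analysis_marks[k - 1]

        synthesis_marks.append(start + t_syn)
        source_indices.append(k)

        # The next synthesis mark is one scaled pitch period later.
        t_syn += period / pitch_factor

    return synthesis_marks, source_indices
```

In a complete TD-PSOLA synthesizer, a windowed segment centred on each selected analysis mark would then be overlap-added at the corresponding synthesis mark. Raising `pitch_factor` or `time_factor` makes frames repeat, and lowering them makes frames get skipped; this frame reuse is a commonly cited reason why purely time-domain methods lose quality under large modifications.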


Bibliographic reference.  Iriondo, Ignasi / Alias, Francesc / Sanchis, Javier / Melenchon, Javier (2003): "A hybrid method oriented to concatenative text-to-speech synthesis", In EUROSPEECH-2003, 2953-2956.