ISCA Archive ICSLP 2000

Concatenative text-to-speech synthesis based on prototype waveform interpolation (a time frequency approach)

Edmilson S. Morais, Paul Taylor, Fábio Violaro

This paper presents preliminary methods for applying the Time-Frequency Interpolation (TFI) technique [3] to concatenative text-to-speech synthesis. The TFI technique described here is a pitch-synchronous time-frequency variant of the well-known Prototype Waveform Interpolation (PWI) technique [2]. The paper describes the basic concepts of representing the speech signal in the time-frequency domain, as well as techniques for performing time-scale and pitch-scale modifications. Exploiting the flexibility of the TFI technique for spectral smoothing, a method was developed to minimize the spectral mismatch at the boundaries of the synthesis units (SUs). The proposed system was evaluated using SUs (diphones) and prosodic modifications generated by the Festival system [1]. An informal subjective test comparing the proposed TFI system with the standard TD-PSOLA system indicated that the proposed system achieves superior quality.
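
As a rough illustration of the general idea behind prototype waveform interpolation, the sketch below morphs one pitch cycle into another by linearly interpolating their harmonic (DFT) coefficients and their pitch periods. It is not the authors' TFI implementation: the function name `interpolate_cycles`, the linear weighting, and the synthetic sinusoidal prototypes are all assumptions made for the example.

```python
import numpy as np

def interpolate_cycles(proto_a, proto_b, n_cycles):
    """Return n_cycles pitch cycles morphing from proto_a to proto_b.

    Hypothetical helper: each prototype is one pitch cycle; its DFT
    coefficients (divided by the cycle length) are treated as harmonic
    amplitudes that can be compared across different pitch periods.
    """
    spec_a = np.fft.rfft(proto_a) / len(proto_a)
    spec_b = np.fft.rfft(proto_b) / len(proto_b)
    n_harm = min(len(spec_a), len(spec_b))  # harmonics present in both
    cycles = []
    for k in range(n_cycles):
        w = k / (n_cycles - 1) if n_cycles > 1 else 0.0
        # Interpolate the pitch period (in samples) and the harmonics.
        period = int(round((1 - w) * len(proto_a) + w * len(proto_b)))
        spec = (1 - w) * spec_a[:n_harm] + w * spec_b[:n_harm]
        # Keep only the harmonics that fit below Nyquist for this period.
        n_keep = min(n_harm, period // 2 + 1)
        cycles.append(np.fft.irfft(spec[:n_keep] * period, n=period))
    return cycles

# Synthetic prototypes (assumed data): 80-sample and 100-sample cycles.
proto_a = np.sin(2 * np.pi * np.arange(80) / 80)
proto_b = 0.5 * np.sin(2 * np.pi * np.arange(100) / 100)
cycles = interpolate_cycles(proto_a, proto_b, 5)
```

Concatenating the returned cycles gives a signal whose pitch period and spectral shape change gradually. In the same spirit, interpolating across a unit boundary is one plausible way to smooth a spectral mismatch, although the paper's actual smoothing method is not specified in this abstract.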

[1] A. Black, P. Taylor, R. Caley. The Festival Speech Synthesis. Available at http://www.cstr.ed.ac.uk/projects/festival.html, 4(5), Sept. 1996.
[2] B. Kleijn, K. Paliwal, eds. Speech Coding and Synthesis. Elsevier, Amsterdam, 1998.
[3] Y. Shoham. High-quality Speech Coding at 2.4 to 4.0 kbps Based on Time-Frequency Interpolation. IEEE Proc. ICASSP '93, II.167-170, April 1993.


Cite as: Morais, E.S., Taylor, P., Violaro, F. (2000) Concatenative text-to-speech synthesis based on prototype waveform interpolation (a time frequency approach). Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 2, 387-390

@inproceedings{morais00_icslp,
  author={Edmilson S. Morais and Paul Taylor and Fábio Violaro},
  title={{Concatenative text-to-speech synthesis based on prototype waveform interpolation (a time frequency approach)}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  pages={vol. 2, 387-390}
}