Sixth European Conference on Speech Communication and Technology
This paper describes a method for text-to-speech waveform synthesis based on the Analysis-by-Synthesis/Overlap-Add (ABS/OLA) sinusoidal model. This model has been shown in previous work to be a useful framework for pitch and time-scale modification of both speech and music signals. This paper explores some extensions of the original ABS/OLA formulation that attempt to overcome specific artifacts, including a phase dithering approach for unvoiced speech synthesis and an improved pitch modification method that compensates for undesirable energy modulation effects. The implementation of the model within a text-to-speech synthesis (TTS) system is described, and the results of a listener evaluation of the method are discussed.
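The abstract names two generic ingredients, sinusoidal frame synthesis and overlap-add, which can be illustrated with a minimal Python sketch. This is not the authors' ABS/OLA algorithm: it omits the analysis-by-synthesis parameter estimation and the pitch and time-scale modification, and the `dither` argument is only an assumed stand-in (a random per-component phase offset) for the phase dithering the abstract mentions. The function names `synth_frame` and `overlap_add`, the triangular window, and all parameter values are hypothetical choices made for illustration.

```python
import math
import random


def synth_frame(amps, freqs, phases, length, sample_rate, dither=0.0, rng=None):
    """Synthesize one frame as a sum of sinusoids.

    The time origin is the frame centre, so `phases` are the component
    phases at the centre sample. `dither` is the maximum random phase
    offset in radians added to each component (an illustrative assumption,
    not the method of the paper).
    """
    rng = rng or random.Random(0)
    offsets = [rng.uniform(-dither, dither) for _ in amps]
    half = length // 2
    frame = []
    for n in range(length):
        t = (n - half) / sample_rate
        frame.append(sum(
            a * math.cos(2.0 * math.pi * f * t + p + d)
            for a, f, p, d in zip(amps, freqs, phases, offsets)
        ))
    return frame


def overlap_add(frames, hop):
    """Overlap-add frames of length 2*hop using a triangular window.

    Adjacent triangular windows sum to one, so a stationary signal is
    reproduced exactly wherever two frames overlap.
    """
    length = 2 * hop
    window = [1.0 - abs(n - hop) / hop for n in range(length)]
    out = [0.0] * (hop * (len(frames) + 1))
    for i, frame in enumerate(frames):
        for n in range(length):
            out[i * hop + n] += window[n] * frame[n]
    return out


# Usage: rebuild a stationary 200 Hz tone from four overlapping frames.
sample_rate, freq, hop, num_frames = 8000, 200.0, 40, 4
frames = []
for k in range(num_frames):
    centre = k * hop + hop  # global sample index of the frame centre
    phase = 2.0 * math.pi * freq * centre / sample_rate
    frames.append(synth_frame([1.0], [freq], [phase], 2 * hop, sample_rate))
signal = overlap_add(frames, hop)
```

Because the frame phases here are chosen to be consistent across frames, the overlapped region reproduces the tone sample for sample. With a nonzero `dither`, the component phases are randomized and that coherence is deliberately broken, which is the general idea behind making a sinusoidal model sound noise-like.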
Bibliographic reference. Macon, Michael W. / Clements, Mark A. (1999): "An enhanced ABS/OLA sinusoidal model for waveform synthesis in TTS", In EUROSPEECH'99, 2327-2330.