Fifth ISCA ITRW on Speech Synthesis
June 14-16, 2004
A concatenative speech synthesis system can potentially generate more natural speech when it uses shorter speech segments, because the variety of possible concatenations increases. In this paper, we propose the use of very short speech segments (5 ms, one pitch period at a pitch of 200 Hz) for concatenative speech synthesis. The proposed method is applied to the CMU ARCTIC speech database, and 100 sentences are synthesized. Although the synthesized speech maintains the speaker's identity and is natural enough, it also contains some noise caused by inappropriate unit selection, and the formant changes are awkward in some vowel regions.
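As a rough illustration of the segment size only, the sketch below converts the 5 ms duration into a sample count and cuts a waveform into consecutive 5 ms segments. It is not the authors' method: the unit selection step is not shown, the 16 kHz sampling rate is an assumption, and the names `segment_length_samples`, `split_into_segments`, and `concatenate` are hypothetical helpers introduced here.

```python
import math

SEGMENT_MS = 5            # segment duration stated in the abstract
PITCH_HZ = 200            # pitch whose period equals one segment
SAMPLE_RATE_HZ = 16000    # assumption: sampling rate, not given in the abstract


def segment_length_samples(sample_rate_hz, segment_ms=SEGMENT_MS):
    """Number of samples in one segment of the given duration."""
    return sample_rate_hz * segment_ms // 1000


def split_into_segments(samples, seg_len):
    """Cut a waveform into consecutive, non-overlapping full segments."""
    n_full = len(samples) // seg_len
    return [samples[i * seg_len:(i + 1) * seg_len] for i in range(n_full)]


def concatenate(segments):
    """Join segments back to back (no smoothing at the joins)."""
    out = []
    for seg in segments:
        out.extend(seg)
    return out


# One pitch period at 200 Hz lasts 1000 / 200 = 5 ms.
period_ms = 1000 / PITCH_HZ

# Stand-in signal: one second of a 200 Hz sine tone instead of real speech.
tone = [math.sin(2 * math.pi * PITCH_HZ * n / SAMPLE_RATE_HZ)
        for n in range(SAMPLE_RATE_HZ)]

seg_len = segment_length_samples(SAMPLE_RATE_HZ)
segments = split_into_segments(tone, seg_len)
rebuilt = concatenate(segments)
```

Under the assumed sampling rate, each segment holds 80 samples and one second of audio yields 200 segments. A real system would choose each segment from a database rather than re-join segments in their original order.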
Bibliographic reference. Hirai, Toshio / Tenpaku, Seiichi (2004): "Using 5 ms segments in concatenative speech synthesis", In SSW5-2004, 37-42.