Sixth ISCA Workshop on Speech Synthesis
To construct a speech synthesis system that can produce diverse voices, we have been developing a speaker-independent approach to HMM-based speech synthesis in which statistical average voice models are adapted to a target speaker using a small amount of speech data. In this paper, we incorporate the high-quality speech vocoding method STRAIGHT and a parameter generation algorithm that considers global variance (GV) into the system to improve the quality of synthetic speech. Furthermore, we introduce a feature-space speaker adaptive training algorithm and a gender-mixed modeling technique to further normalize the average voice model. We build an English text-to-speech system using these techniques and report its performance.
Bibliographic reference. Yamagishi, Junichi / Kobayashi, Takao / Renals, Steve / King, Simon / Zen, Heiga / Toda, Tomoki / Tokuda, Keiichi (2007): "Improved average-voice-based speech synthesis using gender-mixed modeling and a parameter generation algorithm considering GV", In SSW6-2007, 125-130.