12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

On the Use of Extended Context for HMM-Based Spontaneous Conversational Speech Synthesis

Tomoki Koriyama, Takashi Nose, Takao Kobayashi

Tokyo Institute of Technology, Japan

This paper addresses the issue of prosodic variability in HMM-based spontaneous conversational speech synthesis. We propose an extended context set that includes information peculiar to spontaneous speech, derived from the annotation data embedded in a large-scale database of spontaneous Japanese. Objective and subjective evaluation experiments show the effectiveness of the newly introduced contexts. We also propose stopping criteria for decision-tree clustering to alleviate the over-fitting problem. Experimental results show that restricting the size of each leaf node can improve the quality of synthetic speech.

