11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Evaluation of Prosodic Contextual Factors for HMM-Based Speech Synthesis

Shuji Yokomizo, Takashi Nose, Takao Kobayashi

Tokyo Institute of Technology, Japan

We explore the effect of prosodic contextual factors in HMM-based speech synthesis. In a baseline system, many contextual factors are used during model training, and the cost of parameter tying by context clustering becomes relatively high compared to that in speech recognition. We examine the choice of prosodic contexts using objective measures on English and Japanese speech data. The experimental results show that more compact context sets also give performance comparable or close to that of the conventional full context set.

Bibliographic reference. Yokomizo, Shuji / Nose, Takashi / Kobayashi, Takao (2010): "Evaluation of prosodic contextual factors for HMM-based speech synthesis", in INTERSPEECH-2010, 430-433.