8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

A Style Control Technique for HMM-Based Speech Synthesis

Takashi Masuko, Takao Kobayashi, Keisuke Miyanaga

Tokyo Institute of Technology, Japan

This paper describes an approach to controlling the style of synthetic speech in HMM-based speech synthesis. Here, style refers to a speaking style or an emotional expression in speech. Each speech synthesis unit is modeled by a context-dependent HMM in which the mean vector of each output distribution is a function of a parameter vector called the style control vector. Specifically, we assume that the mean vector is modeled by multiple regression on the style control vector. The multiple regression matrices are estimated with the EM algorithm, together with the other HMM parameters. In the synthesis stage, the mean vectors are obtained by transforming an arbitrarily chosen control vector associated with the desired style. The results of subjective tests show that styles can be controlled by choosing the style control vector appropriately.
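
The multiple-regression mean described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form, so the code assumes the mean is a regression matrix applied to the style control vector augmented with a constant bias term. The function name `regression_mean`, the matrix `H`, and all numeric values are hypothetical and chosen only for illustration.

```python
from typing import List


def regression_mean(H: List[List[float]], style: List[float]) -> List[float]:
    """Compute a mean vector as H applied to the augmented vector [1, v_1, ..., v_L].

    Assumption: a constant 1 is prepended to the style control vector so that
    the first column of H acts as a bias. The abstract does not state this.
    """
    xi = [1.0] + list(style)
    if any(len(row) != len(xi) for row in H):
        raise ValueError("each row of H must have len(style) + 1 entries")
    # Each mean component is the inner product of one row of H with xi.
    return [sum(h * x for h, x in zip(row, xi)) for row in H]


# Hypothetical regression matrix: 2-dimensional mean, 2-dimensional style vector.
H = [
    [1.0, 0.5, -0.5],
    [2.0, 0.0, 1.0],
]

neutral = regression_mean(H, [0.0, 0.0])  # [1.0, 2.0]
styled = regression_mean(H, [1.0, 0.0])   # [1.5, 2.0]
```

Under this assumption, changing the style control vector at synthesis time shifts every mean vector linearly, without any re-estimation of the regression matrices.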

