INTERSPEECH 2004 - ICSLP
This paper describes a technique for controlling the style of synthetic speech in HMM-based speech synthesis. Here, style refers to a speaking style or an emotional expression in speech. Each speech synthesis unit is modeled by a context-dependent HMM in which the mean vector of each output distribution is a function of a parameter vector called the style control vector. Specifically, we assume that the mean vector is given by multiple regression on the style control vector. The multiple regression matrices are estimated with the EM algorithm, together with the other HMM parameters. In the synthesis stage, the mean vectors are modified by transforming an arbitrarily given control vector that is associated with the desired style. Subjective test results show that styles can be controlled by choosing the style control vector appropriately.
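The multiple-regression idea can be illustrated with a small sketch. The abstract does not give the exact form of the regression, so the code below assumes an affine mapping in which the mean is the product of a regression matrix and the style control vector augmented with a leading 1 (a bias column). The function name, the matrix `H`, and all numbers are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import List, Sequence


def style_dependent_mean(regression_matrix: Sequence[Sequence[float]],
                         style_vector: Sequence[float]) -> List[float]:
    """Return mu = H * [1, v_1, ..., v_L]^T for one output distribution.

    Assumed form (not confirmed by the abstract): the first column of H
    acts as a bias, the remaining columns weight the style components.
    """
    # Augmented control vector; the leading 1 selects the bias column.
    xi = [1.0, *style_vector]
    if any(len(row) != len(xi) for row in regression_matrix):
        raise ValueError("each row of H needs 1 + len(style_vector) entries")
    # Plain matrix-vector product, one row of H per mean dimension.
    return [sum(h * x for h, x in zip(row, xi)) for row in regression_matrix]


# Hypothetical regression matrix: 2-dimensional mean, 2-dimensional style space.
H = [
    [1.0, 0.5, -0.25],
    [2.0, 0.0, 1.0],
]

neutral = style_dependent_mean(H, [0.0, 0.0])  # origin of the style space
style_a = style_dependent_mean(H, [1.0, 0.0])  # a point assigned to one style
between = style_dependent_mean(H, [0.5, 0.0])  # halfway between the two
```

Under this assumption, moving the control vector continuously between two points in the style space moves every mean vector along a straight line, which is one way to read the claim that an arbitrarily given control vector selects the desired style at synthesis time.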
Bibliographic reference. Masuko, Takashi / Kobayashi, Takao / Miyanaga, Keisuke (2004): "A style control technique for HMM-based speech synthesis", In INTERSPEECH-2004, 1437-1440.