This paper describes a technique for estimating the style expressiveness of an arbitrary speaker's emotional speech. In the proposed technique, the style expressiveness, which represents how strongly emotions and/or speaking styles affect the acoustic features, is estimated using a multiple-regression hidden semi-Markov model (MRHSMM). In model training, we first train an average voice model using neutral-style speech from multiple speakers. Then, speaker- and style-adapted HSMMs are obtained by linear transformation of the average voice model, using a small amount of the target speaker's data. Finally, the MRHSMM of the target speaker is obtained from the adapted models. For given input emotional speech, the style expressiveness is estimated based on the maximum likelihood criterion. The experimental results show that the estimated values correspond well to perceptual ratings.
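The maximum likelihood estimation step can be illustrated with a heavily simplified sketch. It is not the authors' implementation. It assumes a single scalar style value `v`, Gaussian states with diagonal covariance whose mean is the linear regression `bias + slope * v`, a state alignment that is already known, and no duration distributions. The function name `estimate_style_scalar` and all parameter names are hypothetical. Under these assumptions the log-likelihood is quadratic in `v`, so the estimate has a closed form (a variance-weighted least-squares solution).

```python
from typing import List, Sequence


def estimate_style_scalar(
    observations: Sequence[Sequence[float]],
    states: Sequence[int],
    bias: List[List[float]],
    slope: List[List[float]],
    var: List[List[float]],
) -> float:
    """Closed-form ML estimate of a scalar style value v.

    Assumed model (illustrative only): for frame t aligned to state s,
    dimension d of the observation is Gaussian with
    mean bias[s][d] + slope[s][d] * v and variance var[s][d].
    """
    num = 0.0  # accumulates slope * (o - bias) / var
    den = 0.0  # accumulates slope^2 / var
    for o_t, s in zip(observations, states):
        for d, o_td in enumerate(o_t):
            w = slope[s][d] / var[s][d]
            num += w * (o_td - bias[s][d])
            den += w * slope[s][d]
    if den == 0.0:
        raise ValueError("style value is not identifiable: all slopes are zero")
    return num / den
```

In this sketch, frames aligned to states with larger regression slopes and smaller variances contribute more to the estimate, which is the usual behaviour of a weighted least-squares solution.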
Bibliographic reference. Nose, Takashi / Kato, Yoichi / Tachibana, Makoto / Kobayashi, Takao (2008): "An estimation technique of style expressiveness for emotional speech using model adaptation based on multiple-regression HSMM", In INTERSPEECH-2008, 2759-2762.