ISCA Archive Interspeech 2008

An estimation technique of style expressiveness for emotional speech using model adaptation based on multiple-regression HSMM

Takashi Nose, Yoichi Kato, Makoto Tachibana, Takao Kobayashi

This paper describes a technique for estimating the style expressiveness of an arbitrary speaker's emotional speech. In the proposed technique, style expressiveness, which represents how strongly emotions and/or speaking styles affect the acoustic features, is estimated using a multiple-regression hidden semi-Markov model (MRHSMM). In model training, we first train an average voice model using neutral-style speech from multiple speakers. Then, speaker- and style-adapted HSMMs are obtained by linear transformation of the average voice model, using a small amount of the target speaker's data. Finally, the MRHSMM of the target speaker is obtained from the adapted models. For given input emotional speech, the style expressiveness is estimated under the maximum likelihood criterion. The experimental results show that the estimated values correspond well to perceptual ratings.
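
The maximum likelihood estimation step can be illustrated with a simplified sketch. It is not the authors' implementation. It assumes that each frame has already been aligned to a state, that the state's Gaussian mean is a linear function of the style vector (mean = bias + A v), that covariances are diagonal, and that state duration distributions are ignored. Under those assumptions the estimate reduces to a weighted least-squares solve. The function name `estimate_style_vector` and all argument names are hypothetical.

```python
import numpy as np

def estimate_style_vector(obs, bias, regress, variances):
    """Closed-form ML estimate of a style vector v (illustrative only).

    Assumes frame t is Gaussian with mean bias[t] + regress[t] @ v and
    diagonal covariance variances[t]; the alignment is taken as given.

    obs:       (T, D) observed feature vectors
    bias:      (T, D) constant part of each aligned state's mean
    regress:   (T, D, L) regression matrix of each aligned state
    variances: (T, D) diagonal covariance of each aligned state
    """
    T, D, L = regress.shape
    lhs = np.zeros((L, L))
    rhs = np.zeros(L)
    for t in range(T):
        A = regress[t]                # (D, L)
        w = 1.0 / variances[t]        # inverse variances, shape (D,)
        lhs += A.T @ (w[:, None] * A)           # A^T Sigma^-1 A
        rhs += A.T @ (w * (obs[t] - bias[t]))   # A^T Sigma^-1 (o - b)
    # Solve the normal equations for v.
    return np.linalg.solve(lhs, rhs)
```

In this sketch, the returned vector plays the role of the estimated style expressiveness: larger components indicate that the input frames are better explained by means shifted further along the corresponding style axis.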


doi: 10.21437/Interspeech.2008-684

Cite as: Nose, T., Kato, Y., Tachibana, M., Kobayashi, T. (2008) An estimation technique of style expressiveness for emotional speech using model adaptation based on multiple-regression HSMM. Proc. Interspeech 2008, 2759-2762, doi: 10.21437/Interspeech.2008-684

@inproceedings{nose08_interspeech,
  author={Takashi Nose and Yoichi Kato and Makoto Tachibana and Takao Kobayashi},
  title={{An estimation technique of style expressiveness for emotional speech using model adaptation based on multiple-regression HSMM}},
  year=2008,
  booktitle={Proc. Interspeech 2008},
  pages={2759--2762},
  doi={10.21437/Interspeech.2008-684}
}