ISCA Archive Interspeech 2007

Style estimation of speech based on multiple regression hidden semi-Markov model

Takashi Nose, Yoichi Kato, Takao Kobayashi

This paper presents a technique for estimating the degree or intensity of emotional expressions and speaking styles that appear in speech. The key idea is based on a style control technique for speech synthesis using a multiple regression hidden semi-Markov model (MRHSMM), and the proposed technique can be viewed as the inverse process of style control. Based on a maximum likelihood (ML) criterion, we derive an algorithm for estimating the predictor variables of the MRHSMM, each of which represents a kind of emotion intensity or speaking style variability appearing in the acoustic features. We also show preliminary experimental results that demonstrate the ability of the proposed technique on synthetic and acted speech samples with emotional expressions and speaking styles.


doi: 10.21437/Interspeech.2007-620

Cite as: Nose, T., Kato, Y., Kobayashi, T. (2007) Style estimation of speech based on multiple regression hidden semi-Markov model. Proc. Interspeech 2007, 2285-2288, doi: 10.21437/Interspeech.2007-620

@inproceedings{nose07_interspeech,
  author={Takashi Nose and Yoichi Kato and Takao Kobayashi},
  title={{Style estimation of speech based on multiple regression hidden semi-Markov model}},
  year=2007,
  booktitle={Proc. Interspeech 2007},
  pages={2285--2288},
  doi={10.21437/Interspeech.2007-620}
}