INTERSPEECH 2006 - ICSLP
This paper presents a technique for intuitively controlling the degree, or intensity, of speaking styles and emotional expressions in synthetic speech. The conventional style control technique based on the multiple regression HMM (MRHMM) makes it difficult to control the phone duration of synthetic speech, because an HMM has no explicit parameter that appropriately models phone duration. To overcome this problem, we use a multiple regression hidden semi-Markov model (MRHSMM), which has explicit state duration distributions, to control phone duration. Results of subjective tests show that duration control is important for style control of synthetic speech. We also compare the proposed technique with another control technique based on model interpolation.
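As a rough illustration of the idea, the sketch below assumes the common multiple regression formulation, in which the mean of each state distribution is a linear function of a low-dimensional style vector; the abstract itself does not give this formula. The function name `regress_mean`, the matrices `H_output` and `H_duration`, and all numeric values are hypothetical and chosen only for the example. It shows how, under that assumption, one style vector could shift both the output mean and the explicit state duration mean that an HSMM provides.

```python
def regress_mean(H, style):
    """Return H @ [1, style...]: a mean vector that is linear in the style vector.

    H is a list of rows; the first column acts as a bias term.
    (Assumed formulation, not taken from the abstract.)
    """
    xi = [1.0] + list(style)  # augmented style vector
    return [sum(h * x for h, x in zip(row, xi)) for row in H]


# Hypothetical regression matrices for a single state (illustrative values only).
H_output = [[5.0, 0.5, -0.2],   # two-dimensional output (spectral/F0) mean
            [1.0, 0.0, 0.3]]
H_duration = [[8.0, 2.0, -1.0]]  # one-dimensional state duration mean, in frames

# A two-dimensional style vector: the origin stands for a neutral style here,
# and a larger first component stands for a more intense first style.
neutral = [0.0, 0.0]
intense = [1.5, 0.0]

for name, style in (("neutral", neutral), ("intense", intense)):
    print(name,
          "output mean:", regress_mean(H_output, style),
          "duration mean:", regress_mean(H_duration, style))
```

In this toy setting, moving along the first style axis lengthens the state duration mean as well as shifting the output mean, which is the kind of duration control an HMM without explicit duration distributions cannot express.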
Bibliographic reference. Nose, Takashi / Yamagishi, Junichi / Kobayashi, Takao (2006): "A style control technique for speech synthesis using multiple regression HSMM", In INTERSPEECH-2006, paper 1184-Tue3BuP.8.