INTERSPEECH 2004 - ICSLP
This paper describes an emotional speech synthesis system based on HMMs and related modeling techniques. Concatenative speech synthesis requires that all concatenation units be recorded beforehand and made available at synthesis time. Adopting this approach to synthesize the wide variety of human emotions possible in speech implies that the recording process must be repeated for every targeted emotion, which makes the task challenging and time-consuming. In this paper, we propose an emotional speech synthesis technique based on HMMs, intended especially for the case where only a limited amount of training data is available, that directly incorporates the results of subjective evaluations performed on the training data. Listening tests performed on the synthesized speech suggest that the proposed technique helps to improve the emotional content of synthesized speech.
Bibliographic reference. Zen, Heiga / Kitamura, Tadashi / Bulut, Murtaza / Narayanan, Shrikanth / Tsuzuki, Ryosuke / Tokuda, Keiichi (2004): "Constructing emotional speech synthesizers with limited speech database", In INTERSPEECH-2004, 1185-1188.