The Seventh ISCA Tutorial and Research Workshop on Speech Synthesis

Kyoto, Japan
September 22-24, 2010

A Hidden Markov Model-Based Approach for Emotional Speech Synthesis

Chih-Yung Yang, Chia-Ping Chen

Department of Computer Science and Engineering, National Sun Yat-Sen University, Kaohsiung, Taiwan

In this paper, we describe an approach to automatically synthesize emotional speech for a target speaker based on the hidden Markov model of his/her neutral speech. The basic idea is model interpolation between the neutral model of the target speaker and an emotional model selected from a candidate pool. Both the selection of the model to interpolate with and the computation of the interpolation weights are determined by a model-distance measure. In this paper, we propose a monophone-based Mahalanobis distance (MBMD). We evaluate our approach with several subjective tests on synthesized emotional speech for anger, happiness, and sadness. Experimental results show that the implemented system is able to synthesize speech with the emotional expressiveness of the target speaker.
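The abstract describes two computations: a Mahalanobis distance between models, and interpolation weights derived from that distance. The full formulas are in the paper, not the abstract, so the following is only a minimal sketch under stated assumptions: diagonal Gaussian covariances, a per-monophone distance averaged over phones, and inverse-distance weight normalization. The function names and the weighting scheme are illustrative, not the authors' exact method.

```python
import numpy as np

def mahalanobis_distance(mu_a, mu_b, var_b):
    """Mahalanobis distance between two mean vectors, assuming a
    diagonal covariance (var_b holds the per-dimension variances)."""
    d = mu_a - mu_b
    return float(np.sqrt(np.sum(d * d / var_b)))

def monophone_based_distance(model_a, model_b):
    """Illustrative MBMD-style measure: average the per-monophone
    Mahalanobis distances over the phones shared by both models.
    Each model maps a phone symbol to a (mean, variance) pair."""
    shared = model_a.keys() & model_b.keys()
    dists = [mahalanobis_distance(model_a[p][0], model_b[p][0], model_b[p][1])
             for p in shared]
    return float(np.mean(dists))

def interpolation_weights(distances):
    """Weights inversely proportional to model distance,
    normalized to sum to one (an assumed weighting scheme)."""
    inv = 1.0 / np.asarray(distances, dtype=float)
    return inv / inv.sum()

def interpolate_means(means, weights):
    """Weighted combination of Gaussian mean vectors, i.e. the
    interpolated model's mean for one distribution."""
    return sum(w * m for w, m in zip(weights, means))
```

For example, interpolating a neutral model's mean with a selected emotional model's mean using weights from their distances yields a mean that lies between the two, closer to the nearer (more similar) model.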

Index Terms: speech synthesis, HMM, emotional expressiveness, Mahalanobis distance, model interpolation

Bibliographic reference: Yang, Chih-Yung / Chen, Chia-Ping (2010): "A hidden Markov model-based approach for emotional speech synthesis", in SSW7-2010, 126-129.