ISCA Archive SSW 2010

A hidden Markov model-based approach for emotional speech synthesis

Chih-Yung Yang, Chia-Ping Chen

In this paper, we describe an approach for automatically synthesizing the emotional speech of a target speaker based on the hidden Markov model of his or her neutral speech. The basic idea is to interpolate between the neutral model of the target speaker and an emotional model selected from a candidate pool. Both the selection of the emotional model and the computation of the interpolation weight are based on a model-distance measure, for which we propose a monophone-based Mahalanobis distance (MBMD). We evaluate our approach with several subjective tests on synthesized speech expressing anger, happiness, and sadness. Experimental results show that the implemented system is able to synthesize speech with emotional expressiveness of the target speaker.

Index Terms: speech synthesis, HMM, emotional expressiveness, Mahalanobis distance, model interpolation
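
The abstract does not give the formulas, so the following Python sketch is only an illustration of the general idea (distance-based model selection followed by interpolation), not the authors' implementation. It assumes that each model is a dictionary mapping a monophone label to the mean and variance of a single diagonal Gaussian, that the distance is averaged over the monophones two models share, that the two variances are pooled, and that the interpolation weight is 1 / (1 + distance). All function names and the weight mapping are hypothetical.

```python
import numpy as np


def mbmd(model_a, model_b):
    """Average per-monophone Mahalanobis distance between two models.

    Each model maps a monophone label to a (mean, variance) pair of
    1-D arrays for a diagonal Gaussian. Pooling the two variances and
    averaging over monophones are assumptions of this sketch.
    """
    shared = sorted(set(model_a) & set(model_b))
    if not shared:
        raise ValueError("models share no monophones")
    distances = []
    for phone in shared:
        mean_a, var_a = model_a[phone]
        mean_b, var_b = model_b[phone]
        pooled_var = 0.5 * (var_a + var_b)
        diff = mean_a - mean_b
        distances.append(np.sqrt(np.sum(diff * diff / pooled_var)))
    return float(np.mean(distances))


def select_and_interpolate(neutral, candidates):
    """Pick the closest emotional model and blend it with the neutral one.

    Returns the chosen candidate name, the interpolation weight, and the
    interpolated model. The weight mapping 1 / (1 + d) is hypothetical.
    """
    distances = {name: mbmd(neutral, model) for name, model in candidates.items()}
    best = min(distances, key=distances.get)
    weight = 1.0 / (1.0 + distances[best])
    emotional = candidates[best]
    blended = {}
    for phone, (mean_n, var_n) in neutral.items():
        if phone not in emotional:
            # No emotional counterpart: keep the neutral parameters.
            blended[phone] = (mean_n, var_n)
            continue
        mean_e, var_e = emotional[phone]
        # Linear interpolation of means and variances (an assumption).
        blended[phone] = (
            (1.0 - weight) * mean_n + weight * mean_e,
            (1.0 - weight) * var_n + weight * var_e,
        )
    return best, weight, blended


# Toy example with one monophone and two candidate emotional models.
neutral_model = {"a": (np.array([0.0, 0.0]), np.array([1.0, 1.0]))}
candidate_pool = {
    "speaker_1": {"a": (np.array([3.0, 4.0]), np.array([1.0, 1.0]))},
    "speaker_2": {"a": (np.array([0.6, 0.8]), np.array([1.0, 1.0]))},
}
chosen, w, interpolated = select_and_interpolate(neutral_model, candidate_pool)
```

In this toy setup the second candidate is closer to the neutral model, so it is selected and receives a larger weight than the first would. A real system would interpolate full context-dependent HMM state distributions rather than one Gaussian per monophone.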


Cite as: Yang, C.-Y., Chen, C.-P. (2010) A hidden Markov model-based approach for emotional speech synthesis. Proc. 7th ISCA Workshop on Speech Synthesis (SSW 7), 126-129

@inproceedings{yang10_ssw,
  author={Chih-Yung Yang and Chia-Ping Chen},
  title={{A hidden Markov model-based approach for emotional speech synthesis}},
  year=2010,
  booktitle={Proc. 7th ISCA Workshop on Speech Synthesis (SSW 7)},
  pages={126--129}
}