The Seventh ISCA Tutorial and Research Workshop on Speech Synthesis
This research aims to develop a real-time HMM-based Malay emotional speech synthesis system (EM-HTS) that can synthesize arbitrary text input in four expressions: neutral, anger, sadness, and happiness. The quality of the emotional speech synthesis was improved by using the Neutral to Angry, Sad, and Happy (NASH) duration generator, which uses a context-dependent duration generation method to improve the duration information in the label files of the target emotions for training purposes. We conducted three forms of evaluation to determine the accuracy, intelligibility, and naturalness of the speech generated by EM-HTS. All three tests show that the adopted method (NASH) gives a better reproduction of prosody than the conventional method using the same training speech data.
Index Terms: HMM-based emotional speech synthesis, context-dependent duration conversion
Bibliographic reference. Mustafa, Mumtaz B. / Ainon, Raja N. / Zainuddin, Roziati (2010): "EM-HTS: real-time HMM-based Malay emotional speech synthesis", In SSW7-2010, 340-344.