This paper describes a system, Affective Story Teller (AST), as an example of an emotionally expressive speech synthesizer. Our technique uses several linguistic resources to recognize emotions in the input text according to its emotional affinity, and assigns appropriate prosodic parameters and pitch accents through XML-based tagging to generate a synthesized speech sample. The synthesized sample is then re-synthesized through TD-PSOLA-based pitch manipulation in accordance with its emotional connotation. The system employs the MARY TTS system to read out a folk tale. The preliminary perceptual test results are encouraging: human judges listening to the re-synthesized speech samples of AST perceived “happy”, “sad”, and “fear” emotions much better than when they listened to non-affective synthesized speech.
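As a rough illustration of the text-analysis and tagging stage described above, the following minimal sketch classifies a sentence with a small cue-word lexicon and wraps it in an XML prosody element. Everything in it is an assumption made for illustration: the lexicon entries, the pitch and rate offsets, and the `classify` and `tag` helper names are hypothetical, and the generic `prosody` element is not claimed to match the markup AST passes to MARY TTS. The TD-PSOLA re-synthesis step is not covered.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical cue-word lexicon; AST relies on richer linguistic resources.
LEXICON = {
    "happy": {"joy", "laughed", "delighted", "smiled"},
    "sad": {"wept", "alone", "died", "sorrow"},
    "fear": {"wolf", "dark", "trembled", "scream"},
}

# Hypothetical prosodic offsets per emotion (illustrative values only).
PROSODY = {
    "happy": {"pitch": "+15%", "rate": "+10%"},
    "sad": {"pitch": "-10%", "rate": "-15%"},
    "fear": {"pitch": "+20%", "rate": "+20%"},
}


def classify(sentence):
    """Return the emotion whose cue words occur most often, or 'neutral'."""
    words = set(re.findall(r"[a-z']+", sentence.lower()))
    scores = {emotion: len(words & cues) for emotion, cues in LEXICON.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "neutral"


def tag(sentence):
    """Wrap an emotional sentence in a prosody element; leave neutral text bare."""
    emotion = classify(sentence)
    text = escape(sentence)  # escape &, <, > so the output stays well-formed
    if emotion == "neutral":
        return text
    attrs = " ".join(f"{name}={quoteattr(value)}"
                     for name, value in PROSODY[emotion].items())
    return f"<prosody {attrs}>{text}</prosody>"


if __name__ == "__main__":
    for line in ["The girl laughed and smiled.", "She trembled in the dark."]:
        print(tag(line))
```

A sentence with no cue words is returned untagged, so a synthesizer would fall back to its default, non-affective prosody for that sentence.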
Bibliographic reference. Shaikh, Mostafa Al Masum / Rebordão, Antonio Rui Ferreira / Hirose, Keikichi (2010): "Affective story teller: a TTS system for emotional expressivity", In INTERSPEECH-2010, 518-521.