11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Affective Story Teller: A TTS System for Emotional Expressivity

Mostafa Al Masum Shaikh, Antonio Rui Ferreira Rebordão, Keikichi Hirose

University of Tokyo, Japan

This paper describes Affective Story Teller (AST), an emotionally expressive speech synthesizer. Our technique uses several linguistic resources to recognize emotions in the input text according to its emotional affinity, and assigns appropriate prosodic parameters and pitch accents through XML-based tagging to generate a synthesized speech sample. The sample is then re-synthesized using TD-PSOLA-based pitch manipulation in accordance with its emotional connotation. The system uses the MARY TTS system to read out a folk tale. Preliminary perceptual test results are encouraging: human judges listening to the re-synthesized speech samples of AST perceived the “happy”, “sad”, and “fear” emotions much better than when they listened to non-affective synthesized speech.
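The text-side stage described above (recognize an emotion, then attach prosodic parameters through XML tagging) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the keyword lexicon, the pitch and rate values, and the function names `detect_emotion` and `tag_sentence` are hypothetical, and the generic `<prosody>` element stands in for whatever markup AST actually emits. The TD-PSOLA re-synthesis step is not covered.

```python
import xml.etree.ElementTree as ET

# Hypothetical keyword lexicon; the linguistic resources used by AST
# are not enumerated in the abstract, so this is a stand-in.
EMOTION_KEYWORDS = {
    "happy": {"joy", "laughed", "delighted"},
    "sad": {"wept", "lonely", "died"},
    "fear": {"afraid", "monster", "trembled"},
}

# Illustrative prosody settings (assumed values, not taken from the paper).
PROSODY = {
    "happy": {"pitch": "+15%", "rate": "+10%"},
    "sad": {"pitch": "-10%", "rate": "-15%"},
    "fear": {"pitch": "+20%", "rate": "+20%"},
    "neutral": {},
}


def detect_emotion(sentence):
    """Return the emotion whose keywords occur most often, else 'neutral'."""
    words = {w.strip(".,!?;:\"'").lower() for w in sentence.split()}
    best, best_hits = "neutral", 0
    for emotion, keywords in EMOTION_KEYWORDS.items():
        hits = len(words & keywords)
        if hits > best_hits:
            best, best_hits = emotion, hits
    return best


def tag_sentence(sentence):
    """Wrap a sentence in a <prosody> element chosen by its detected emotion."""
    emotion = detect_emotion(sentence)
    elem = ET.Element("prosody", PROSODY[emotion])
    elem.text = sentence
    return emotion, ET.tostring(elem, encoding="unicode")


if __name__ == "__main__":
    for line in ["The lonely king wept.", "The children laughed with joy."]:
        print(tag_sentence(line))
```

A keyword count is only a placeholder for affect sensing; a real system would draw on richer lexical and semantic resources, and the tagged output would be passed to the synthesizer before any pitch manipulation.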

