Eighth ISCA Workshop on Speech Synthesis
Barcelona, Catalonia, Spain
The generation of synthetic speech with a certain degree of expressiveness has been successful for some particular applications or speaking styles (e.g. emotions). In this context, one speaking style with subtle speech nuances may be of great interest for delivering expressive speech: the storytelling style. The purpose of this paper is to take a first step towards developing a storytelling Text-to-Speech (TTS) synthesis system by modelling the specific prosodic patterns (pitch, intensity and tempo) of this speaking style. We base our analysis of a tale in Spanish on the discourse modes present in storytelling: narrative, descriptive and dialogue. Moreover, we introduce narrative situations (neutral narrative, post-character, decreasing suspense and affective situations) within the narrative mode, which are analysed at the sentence level. After grouping the sentences into modes and narrative situations, we analyse their corresponding prosodic patterns both objectively (via statistical tests) and subjectively (via a perceptual test using resynthesized sentences). The results show that the statistically validated prosodic rules perform as well as (or even better than) the original prosody in most sentences.
Index Terms: storytelling, prosodic analysis, narrative situations, TTS, Harmonic plus Noise Model
Bibliographic reference. Montaño, Raúl / Alías, Francesc / Ferrer, Josep (2013): "Prosodic analysis of storytelling discourse modes and narrative situations oriented to text-to-speech synthesis", In SSW8, 171-176.