Voice Quality: Functions, Analysis and Synthesis

August 27-29, 2003
Geneva, Switzerland

Emotions and Voice Quality: Experiments with Sinusoidal Modeling

Carlo Drioli (1), Graziano Tisato (1), Piero Cosi (1), Fabio Tesser (2)

(1) Laboratory of Phonetics and Dialectology, ISTC-CNR, Institute of Cognitive Sciences and Technology, Italy
(2) Centre for Scientific and Technological Research, ITC-IRST, Italy

Voice quality is known to play an important role in conveying emotions in verbal communication. In this paper we explore the effectiveness of a sinusoidal modeling framework for voice transformations aimed at the analysis and synthesis of emotive speech. A set of acoustic cues is selected to compare the voice quality characteristics of speech signals in a voice corpus in which different emotions are reproduced. The sinusoidal signal processing tool is used to convert a neutral utterance into emotive utterances. Two procedures are applied and compared: the first performs only the alignment of phoneme durations and pitch contour; the second refines the transformation with a spectral conversion function. This refinement improves the reproduction of the voice qualities of the target emotive utterances. The acoustic cues extracted from the transformed utterances are compared with those of the original emotive utterances, and the properties and quality of the transformation method are discussed.
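As a rough illustration of the first procedure (duration and pitch alignment only), the sketch below resynthesizes a harmonic signal after imposing a target duration and pitch contour on a neutral source. It is not the authors' tool: the function names `warp_contour` and `synthesize`, the frame hop, the sample rate, the fixed harmonic amplitudes and the single global time warp (rather than a phoneme-by-phoneme alignment) are all assumptions made for this example.

```python
import math

def warp_contour(contour, n_out):
    """Linearly time-warp a per-frame contour to n_out frames (hypothetical helper)."""
    if n_out == 1 or len(contour) == 1:
        return [float(contour[0])] * n_out
    out = []
    for i in range(n_out):
        # Fractional position of output frame i in the input contour
        pos = i * (len(contour) - 1) / (n_out - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(contour) - 1)
        frac = pos - lo
        out.append((1.0 - frac) * contour[lo] + frac * contour[hi])
    return out

def synthesize(f0_frames, gains, harmonic_amps, sr=16000, hop=80):
    """Sum-of-sinusoids synthesis; f0 and gain are held constant within each frame."""
    samples = []
    phase = 0.0  # running phase of the fundamental, kept continuous across frames
    for f0, gain in zip(f0_frames, gains):
        for _ in range(hop):
            phase += 2.0 * math.pi * f0 / sr
            s = 0.0
            for k, amp in enumerate(harmonic_amps, start=1):
                if k * f0 < sr / 2:  # drop harmonics at or above the Nyquist frequency
                    s += amp * math.sin(k * phase)
            samples.append(gain * s)
    return samples

# Assumed toy data: a flat neutral gain envelope and a rising target pitch contour.
neutral_gain = [0.2, 0.8, 1.0, 0.6, 0.1]
target_f0 = [150.0 + 2.0 * i for i in range(8)]

# Duration alignment: stretch the neutral envelope to the target frame count,
# then impose the target pitch contour during resynthesis.
aligned_gain = warp_contour(neutral_gain, len(target_f0))
emotive = synthesize(target_f0, aligned_gain, harmonic_amps=[1.0, 0.5, 0.25])
```

The harmonic amplitudes stay fixed here, so the spectral envelope of the source is left untouched; the second procedure described above would additionally modify those amplitudes through a spectral conversion function, which this sketch does not attempt.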


Bibliographic reference. Drioli, Carlo / Tisato, Graziano / Cosi, Piero / Tesser, Fabio (2003): "Emotions and voice quality: experiments with sinusoidal modeling", in VOQUAL'03, 127-132.