ISCA Workshop on Plasticity in Speech Perception (PSP2005)

Senate House, London, UK
June 15-17, 2005

Tracing Vocal Expression of Emotion Along the Speech Chain: Do Listeners Perceive What Speakers Feel?

Sonja Biersack, Vera Kempe

University of Stirling, UK

To date, research on the vocal expression of emotion has mainly focused on the relationship between acted emotions and vocal cues on the one hand, and between vocal cues and perceived emotions on the other (Scherer, 2003), implying a close agreement between what is produced and what is perceived. This study examines whether emotional valence reported by speakers and emotional valence perceived by listeners are linked to the same vocal cues. We reassess the congruity between production and perception in the domain of emotion expression by tracing the functionality of vocal cues along the speech chain. Eighty-eight men and 112 women rated their current emotional state using the Brief Mood Introspection Scale (Mayer & Gaschke, 1988) and produced a target sentence embedded in a referential communication task. Twenty other participants rated the target sentences for perceived happiness on a scale from 1 to 7. Reported positive emotion and perceived happiness were both higher in women, but the two were not correlated within either gender. From the target sentences, we obtained measures of pitch, pitch range, speech rate, intensity, shimmer and jitter, as well as the first and second formants. All acoustic measures except speech rate were converted into z-scores for men and women separately to normalise for gender. Correlations between the acoustic measures, reported emotional valence, and perceived happiness showed that positive emotion reported by the speaker was positively correlated with pitch range. Stepwise regression analyses confirmed pitch range as the best predictor of reported positive emotion. Perceived happiness was correlated with higher pitch, faster speech rate, wider pitch range, steeper declination, and lower perturbations.
These findings suggest that most vocal cues used by the listener in emotion perception are not necessarily indicative of the emotional state of the speaker. They support the view that vocal cues, especially cues related to timing and speech rate, are not just epiphenomena of the speaker's emotional state, but may serve as signals to affect the listener (Owren & Bachorowski, 2003). Still, pitch range seems to be the most reliable indicator of the valence of the speaker's emotion, suggesting that it can mediate between experienced and perceived emotions and may be the vocal cue most closely associated with genuine emotion expression. More generally, our results cast doubt on a direct mapping of vocal cues between perception and production in the domain of emotion expression.
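
For readers who want to see what the gender normalisation step amounts to, the following is a minimal sketch in Python, not the authors' analysis code. It converts one acoustic measure into z-scores separately within each gender group and then computes a Pearson correlation with a rating. The function names (`zscore_within_groups`, `pearson_r`) and all data values are hypothetical and chosen only for illustration; the abstract does not state which software or which correlation procedure was used.

```python
from collections import defaultdict
from statistics import mean, stdev


def zscore_within_groups(values, groups):
    """Return z-scores computed separately within each group label.

    Each value is centred on its own group's mean and scaled by its own
    group's sample standard deviation. Every group needs at least two
    values with non-zero spread, otherwise the scaling is undefined.
    """
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)
    stats = {g: (mean(vs), stdev(vs)) for g, vs in by_group.items()}
    return [(v - stats[g][0]) / stats[g][1] for v, g in zip(values, groups)]


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Made-up illustration data: six speakers, not values from the study.
gender = ["m", "m", "m", "f", "f", "f"]
pitch_range_hz = [60, 80, 100, 120, 150, 180]
reported_valence = [2, 3, 5, 1, 4, 6]

pitch_range_z = zscore_within_groups(pitch_range_hz, gender)
print(pitch_range_z)
print(pearson_r(pitch_range_z, reported_valence))
```

After this transformation, a male and a female speaker with the same z-score are equally far from their own group's average, so a correlation with reported valence is not driven by the overall difference in pitch range between men and women.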

