Speech Prosody 2004

Nara, Japan
March 23-26, 2004

Application of a Psychoacoustical Model of Harmony to Speech Prosody

Norman D. Cook, Takashi Fujisawa, Kazuaki Takami

Department of Informatics, Kansai University, Osaka, Japan

We have studied the prosody of emotional speech using a psychoacoustical model of musical harmony, which was designed to explain the basic facts of the perception of pitch combinations: interval consonance/dissonance and chordal harmony/tension. For any voiced utterance, the model provides four quasi-musical measures: dissonance, tension, total harmonic "instability", and "modality" of the pitches used. Modality is the most interesting, as it relates to the major and minor modes of traditional harmony theory and their characteristic positive and negative affect. In a study of emotional speech using 216 utterances, factor analysis showed that these measures are distinct from those obtained from basic statistics on the fundamental frequency of the voice (mean F0, range, rate of change, etc.). Moreover, there was a significant correlation between the major/minor modality measure and the positive/negative affect of the utterance. We argue that, in addition to the traditional acoustical measures, a harmony measure is essential for determining the affective character of the tone of voice.

