ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition
April 13-16, 2003
The emotional valence of speech cannot be evaluated on the basis of first-order statistics of the fundamental frequency (F0), and the quantification of "pitch contours" has remained an intractable problem. We have addressed the relationship between emotion and F0 using a new technique that extracts the dominant tones within speech utterances and then analyzes their interval structure. Our approach entails the summation of F0 over the entire utterance and calculation of the underlying pitch structure using an unsupervised "cluster" (radial basis function) algorithm. The technique normally yields 2-5 Gaussian "pitch clusters" per utterance, which can then be evaluated in terms of their inherent dissonance and harmonic tension. We found greater dissonance and greater harmonic tension in utterances with negative affect than in utterances with positive affect.
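The pipeline described in the abstract (pool the F0 values of an utterance, find a few dominant pitches, then examine the intervals between them) can be illustrated with a minimal sketch. It is not the authors' implementation. Plain 1-D k-means stands in for the unsupervised radial basis function clustering. The 100 Hz reference, the function names, and the `toy_tension` formula with its `alpha` value are all assumptions made for illustration. No dissonance measure is attempted here.

```python
import math


def hz_to_semitones(f0_hz, ref_hz=100.0):
    """Convert F0 samples in Hz to semitones above ref_hz.

    ref_hz is an arbitrary reference chosen for this sketch.
    Non-positive values (unvoiced frames) are dropped.
    """
    return [12.0 * math.log2(f / ref_hz) for f in f0_hz if f > 0]


def kmeans_1d(values, k, iters=50):
    """Plain 1-D k-means, used here as a stand-in for the paper's
    clustering step. Returns the cluster centres in ascending order."""
    lo, hi = min(values), max(values)
    # Deterministic start: centres spread evenly over the observed range.
    centres = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            groups[nearest].append(v)
        # An empty cluster keeps its previous centre.
        updated = [sum(g) / len(g) if g else c
                   for g, c in zip(groups, centres)]
        if updated == centres:
            break
        centres = updated
    return sorted(centres)


def intervals(centres):
    """Intervals in semitones between adjacent dominant pitches."""
    return [b - a for a, b in zip(centres, centres[1:])]


def toy_tension(lower, upper, alpha=0.6):
    """Hypothetical tension score for two stacked intervals: it equals 1
    when the intervals are equal and decays as they differ. Both the
    functional form and alpha are assumptions of this sketch."""
    return math.exp(-((upper - lower) / alpha) ** 2)


# Synthetic F0 track (Hz) hovering around three pitch levels.
f0 = [98, 100, 102, 124, 126, 128, 148, 150, 152]
centres = kmeans_1d(hz_to_semitones(f0), k=3)
ivs = intervals(centres)
print([round(c, 2) for c in centres], [round(i, 2) for i in ivs])
```

On this synthetic track the three centres land close to 0, 4, and 7 semitones, giving intervals of roughly 4 and 3 semitones. Because the two intervals are unequal, `toy_tension` returns a low value for them. Real utterances would need an F0 tracker upstream and a principled choice of the number of clusters, neither of which is covered here.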
Bibliographic reference. Fujisawa, Takashi / Takami, Kazuaki / Cook, Norman D. (2003): "On the role of pitch intervals in the perception of emotional speech", in SSPR-2003, paper TAP17.