9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Analysis of Voice-Quality Features of Speech That Expresses "Anger", "Joy", and "Sadness" Uttered by Radio Actors and Actresses

Shoichi Takeda (1), Yuuri Yasuda (2), Risako Isobe (3), Shogo Kiryu (3), Makiko Tsuru (1)

(1) Kinki University, Japan; (2) Osaka Gas Information System Research Institute Co. Ltd., Japan; (3) Musashi Institute of Technology, Japan

This paper analyzes the voice-quality features of "anger", "joy", and "sadness" in Japanese speech as a function of the degree of emotion expressed. The degrees of emotion were "neutral", "light", "medium", and "strong". Among voice-quality features, we focused on the noise level of the glottal-flow waveform. We adopted the AR (autoregressive) model and measured the noise level of the prediction residual signal of speech expressing each emotion. To measure the noise level relative to the signal level, we introduced the "noise-to-signal (N/S) ratio". The analysis results showed that the relative noise levels in the residual-waveform spectra differed among emotions: the N/S ratio was larger in the order "anger" > "sadness", "neutral" > "joy", with differences of approximately 4 dB.
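As an illustration of the kind of measurement described above, the following is a minimal sketch, not the authors' implementation. It fits an AR model to one speech frame with the autocorrelation method (Levinson-Durbin recursion), inverse-filters the frame to obtain the prediction residual, and computes a noise-to-signal ratio in the residual spectrum. The abstract does not define the N/S ratio precisely, so the definition used here (power in inter-harmonic bins divided by power in bins near multiples of a known fundamental frequency) is an assumption, as are all function names, the AR order, and the parameter `half_width_hz`.

```python
import math
import random


def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of one frame (unnormalized)."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the AR normal equations.
    Returns (a, err) with a[0] == 1 so that e[n] = sum_k a[k] * x[n-k]."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k                  # updated prediction error
    return a, err


def lpc_residual(x, order):
    """Prediction residual of frame x under an AR model of the given order."""
    a, _ = levinson_durbin(autocorr(x, order), order)
    return [sum(a[k] * x[n - k] for k in range(order + 1) if n - k >= 0)
            for n in range(len(x))]


def power_spectrum(x):
    """Power spectrum (bins 0..n/2) of a Hann-windowed frame, via a direct DFT."""
    n = len(x)
    xw = [x[i] * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1))) for i in range(n)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(xw[i] * math.cos(2 * math.pi * k * i / n) for i in range(n))
        im = sum(xw[i] * math.sin(2 * math.pi * k * i / n) for i in range(n))
        spec.append(re * re + im * im)
    return spec


def ns_ratio_db(res, fs, f0, half_width_hz=25.0):
    """ASSUMED definition: 10*log10(inter-harmonic power / harmonic power)
    in the residual spectrum, given a known fundamental frequency f0 (Hz)."""
    n = len(res)
    sig = noise = 0.0
    for k, p in enumerate(power_spectrum(res)):
        if k == 0:
            continue                        # ignore the DC bin
        f = k * fs / n
        h = round(f / f0)                   # index of the nearest harmonic
        if h >= 1 and abs(f - h * f0) <= half_width_hz:
            sig += p
        else:
            noise += p
    return 10.0 * math.log10(noise / sig)


def synth_frame(noise_sigma, n=400, period=80, seed=0):
    """Synthetic test frame (not real speech): a pulse train plus Gaussian
    noise, passed through a stable two-pole resonator."""
    rng = random.Random(seed)
    y = []
    for i in range(n):
        e = (1.0 if i % period == 0 else 0.0) + rng.gauss(0.0, noise_sigma)
        y1 = y[-1] if i >= 1 else 0.0
        y2 = y[-2] if i >= 2 else 0.0
        y.append(e + 1.3 * y1 - 0.8 * y2)
    return y
```

For example, calling `ns_ratio_db(lpc_residual(synth_frame(0.2), order=4), fs=8000, f0=100)` on the synthetic frame should give a higher value than the same call with `synth_frame(0.02)`, since more noise in the excitation raises the inter-harmonic level of the residual. Real speech would additionally require framing, pitch estimation, and a choice of AR order, none of which are specified in the abstract.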

