EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

Frequency Distribution Based Weighted Sub-Band Approach for Classification of Emotional/Stressful Content in Speech

Mandar A. Rahurkar, John H.L. Hansen

University of Colorado at Boulder, USA

In this paper we explore the use of nonlinear Teager Energy Operator based features derived from multi-resolution sub-band analysis for classification of emotional/stressful speech. We propose a novel scheme for automatic sub-band weighting in an effort towards developing a generic algorithm for understanding emotion or stress in speech. We evaluate the proposed algorithm using a corpus of audio material from a stressful military Soldier of the Quarter Board evaluation panel. We establish classification performance of emotional/stressful speech using an open speaker set with open test tokens. With the new frequency distribution based scheme, we obtain a relative detection error reduction of 81.3% in stressed speech, and a 75.4% relative reduction in the neutral speech detection error rate. The results suggest an important step forward in establishing an effective processing scheme for developing generic models of neutral and emotional speech.
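
For readers unfamiliar with the Teager Energy Operator named in the abstract, the following is a minimal sketch of its standard discrete-time form, psi[n] = x[n]^2 - x[n-1]*x[n+1]. The abstract does not give the paper's exact feature pipeline, so the function name `teager_energy` and the test tone are illustrative assumptions, and the sub-band decomposition and weighting scheme are not reproduced here.

```python
import math

def teager_energy(x):
    """Standard discrete Teager Energy Operator (hypothetical helper name).

    Returns psi[n] = x[n]^2 - x[n-1]*x[n+1] for the interior samples,
    so the output is two samples shorter than the input.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

# Illustrative input (an assumption, not data from the paper):
# a pure tone A*cos(omega*n). For such a tone the operator is constant
# and equals A^2 * sin(omega)^2, which tracks both amplitude and frequency.
A, omega = 2.0, 0.3
tone = [A * math.cos(omega * n) for n in range(50)]
psi = teager_energy(tone)
expected = (A * math.sin(omega)) ** 2
```

In a sub-band setting, an operator like this would typically be applied to each band-filtered signal separately; how the bands are chosen and weighted is the subject of the paper itself.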


Bibliographic reference.  Rahurkar, Mandar A. / Hansen, John H.L. (2003): "Frequency distribution based weighted sub-band approach for classification of emotional/stressful content in speech", In EUROSPEECH-2003, 721-724.