7th International Conference on Spoken Language Processing
September 16-20, 2002
This paper reports on emotion recognition using both acoustic and language information in spoken utterances. Most previous efforts have focused on emotion recognition from acoustic correlates, although it is well known that language information also conveys emotion. To capture emotional information at the language level, we introduce the information-theoretic notion of ‘emotional salience’. For acoustic information, linear discriminant classifiers and k-nearest neighbor classifiers were used for emotion classification. Combining the acoustic and linguistic information is posed as a data fusion problem to obtain a combined decision. Results on spoken dialog data from a telephone-based human-machine interaction application show that combining acoustic and language information improves negative emotion classification by 45.7% over acoustic information alone (using the linear discriminant classifier) and by 32.9% over language information alone.
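As a rough illustration of the ‘emotional salience’ idea mentioned in the abstract, the sketch below computes, for each word, the mutual information between that word and the emotion classes from word–emotion co-occurrence counts. The specific words, counts, and the two-class (negative / non-negative) setup are hypothetical toy data, not taken from the paper; the formula shown is one standard way to define such a salience measure and may differ in detail from the authors' exact formulation.

```python
import math

# counts[word][emotion] = number of utterances containing `word` that
# were labeled with `emotion` (hypothetical toy data for illustration)
counts = {
    "no":   {"negative": 8, "non-negative": 2},
    "yes":  {"negative": 2, "non-negative": 8},
    "okay": {"negative": 5, "non-negative": 5},
}

def emotional_salience(word, counts):
    """Mutual-information-style salience:
    sal(w) = sum_k P(e_k|w) * log2(P(e_k|w) / P(e_k))."""
    total = sum(sum(c.values()) for c in counts.values())
    classes = {e for c in counts.values() for e in c}
    # class priors P(e_k), estimated over all word occurrences
    prior = {e: sum(c.get(e, 0) for c in counts.values()) / total
             for e in classes}
    word_total = sum(counts[word].values())
    sal = 0.0
    for e, n in counts[word].items():
        p_e_given_w = n / word_total          # P(e_k | w)
        if p_e_given_w > 0:
            sal += p_e_given_w * math.log2(p_e_given_w / prior[e])
    return sal

for w in counts:
    print(f"{w}: {emotional_salience(w, counts):.3f}")
```

With these toy counts, a word used evenly across classes ("okay") gets zero salience, while words skewed toward one emotion class ("no", "yes") score higher, matching the intuition that salient words carry emotion-discriminating information.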
Bibliographic reference. Lee, Chul Min / Narayanan, Shrikanth S. / Pieraccini, Roberto (2002): "Combining acoustic and language information for emotion recognition", In ICSLP-2002, 873-876.