4th International Conference on Spoken Language Processing
Philadelphia, PA, USA
This paper explores several statistical pattern recognition techniques to classify utterances according to their emotional content. We have recorded a corpus of emotional speech containing over 1000 utterances from different speakers. We present a new method of extracting prosodic features from speech, based on a smoothing spline approximation of the pitch contour. To make maximal use of the limited amount of training data available, we introduce a novel pattern recognition technique: majority voting of subspace specialists. Using this technique, we obtain classification performance that is close to human performance on the task.
Bibliographic reference. Dellaert, Frank / Polzin, Thomas / Waibel, Alex (1996): "Recognizing emotion in speech", In ICSLP-1996, 1970-1973.