ITRW on Speech and Emotion, September 5-7, 2000
This paper discusses the possibility of extracting features from the speech signal that can be used to detect the emotional state of the speaker within the automatic speech recognition (ASR) framework.
After the introduction, a short overview of the ASR framework is presented. Next, we discuss the relation between emotion recognition and ASR, and the different approaches found in the literature for relating emotions to acoustic features. The conclusion is that emotion itself will be very difficult to predict with high accuracy, but that within ASR, general prosodic information has the potential to improve (word) accuracy for limited-domain tasks.
Bibliographic reference. Bosch, Louis ten (2000): "Emotions: What is possible in the ASR framework", Invited review paper, In SpeechEmotion-2000, 189-194.