In the study of expressive speech communication, it is commonly accepted that the emotion perceived by the listener is a good approximation of the emotion the speaker intended to convey. This paper examines the validity of this assumption by comparing the emotional assessments made by naive listeners with those made by the speakers who produced the data. The analysis rests on the hypothesis that people are better decoders of their own emotions, so self-assessments should be closer to the intended emotions. Using the IEMOCAP database, discrete (categorical) and continuous (attribute) emotional assessments provided by the actors and by naive listeners are compared. The results indicate a mismatch between the expression and the perception of emotion. The speakers in the database assigned their own emotions to more specific emotional categories, which led to more extreme values in the activation-valence space.
Bibliographic reference. Busso, Carlos / Narayanan, Shrikanth S. (2008): "The expression and perception of emotions: comparing assessments of self versus others", In INTERSPEECH-2008, 257-260.