The speech signal contains information that characterises the speaker and the phonetic content, together with the emotion being expressed. This paper examines the effect of this speaker- and phoneme-specific information on speech-based automatic emotion classification. The performance of a classification system using established acoustic and prosodic features is compared across different phonemes, in both speaker-dependent and speaker-independent modes, using the LDC Emotional Prosody speech corpus. Results from these evaluations indicate that speaker variability is more significant than phonetic variation. They also suggest that some phonemes are easier to classify than others.
Bibliographic reference: Sethu, Vidhyasaharan / Ambikairajah, Eliathamby / Epps, Julien (2008): "Phonetic and speaker variations in automatic emotion classification", in INTERSPEECH-2008, 617-620.