ISCA Archive ICSLP 1998

Recognizing emotions in speech using short-term and long-term features

Yang Li, Yunxin Zhao

The acoustic characteristics of speech are influenced by speakers' emotional states. In this study, we attempted to recognize the emotional state of individual speakers using speech features extracted from short-time analysis frames as well as speech features that represent entire utterances. Principal component analysis was used to analyze the importance of individual features in representing emotional categories. Three classification methods were used: vector quantization, artificial neural networks, and Gaussian mixture density models. Classification was conducted using short-term features only, long-term features only, and both short-term and long-term features. The best recognition performance, 62% accuracy, was achieved by the Gaussian mixture density method with both short-term and long-term features.


doi: 10.21437/ICSLP.1998-560

Cite as: Li, Y., Zhao, Y. (1998) Recognizing emotions in speech using short-term and long-term features. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0379, doi: 10.21437/ICSLP.1998-560

@inproceedings{li98b_icslp,
  author={Yang Li and Yunxin Zhao},
  title={{Recognizing emotions in speech using short-term and long-term features}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 0379},
  doi={10.21437/ICSLP.1998-560}
}