Automatic emotion recognition can improve the evaluation of customer satisfaction and the detection of customer problems in call centers. For this purpose, we define emotion recognition as a binary classification between angry and non-angry speech in Turkish human-human call center conversations. We investigated both acoustic and language models for this task. Among the acoustic models, Support Vector Machines (SVM) reached 82.9% accuracy, whereas Gaussian Mixture Models (GMM) performed worse, with 77.9%. For language modeling, we compared word-based, stem-only and stem+ending structures. The stem+ending based system gave the highest accuracy of the three, 72%, using manual transcriptions. This can be attributed mainly to the agglutinative nature of the Turkish language. When we fused the acoustic and language model (LM) classifiers using a Multi Layer Perceptron (MLP), we achieved 89% correct detection of both the angry and non-angry classes.
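The three language modeling units compared above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy stem lexicon `STEMS`, the longest-prefix split in `split_stem_ending`, and the `+` prefix that marks an ending token in `to_units` are all assumptions made for illustration. A real system would rely on a Turkish morphological analyzer.

```python
# Hypothetical toy stem lexicon (assumption); a real system would use a
# Turkish morphological analyzer instead of a fixed set of stems.
STEMS = {"ev", "gel", "fatura", "sorun"}


def split_stem_ending(word, stems=STEMS):
    """Split a word into (stem, ending) by longest-prefix match on the lexicon.

    If no stem matches, the whole word is returned as the stem with an
    empty ending.
    """
    word = word.lower()
    for cut in range(len(word), 0, -1):
        if word[:cut] in stems:
            return word[:cut], word[cut:]
    return word, ""


def to_units(sentence, mode="stem+ending"):
    """Convert a sentence into language model units.

    mode="word":        surface words
    mode="stem":        stems only, endings discarded
    mode="stem+ending": stem followed by its ending as a separate token
                        (the "+" marker is an illustrative convention)
    """
    units = []
    for word in sentence.split():
        stem, ending = split_stem_ending(word)
        if mode == "word":
            units.append(word.lower())
        elif mode == "stem":
            units.append(stem)
        elif mode == "stem+ending":
            units.append(stem)
            if ending:
                units.append("+" + ending)
        else:
            raise ValueError("unknown mode: " + mode)
    return units


if __name__ == "__main__":
    for mode in ("word", "stem", "stem+ending"):
        print(mode, to_units("faturam gelmedi", mode))
```

In this sketch, the stem+ending mode keeps the suffix information that the stem-only mode discards, while still sharing the stem token across inflected forms. That is one plausible reading of why such units suit an agglutinative language.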
Bibliographic reference. Erden, Mustafa / Arslan, Levent M. (2011): "Automatic detection of anger in human-human call center dialogs", In INTERSPEECH-2011, 81-84.