INTERSPEECH 2012
13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Speaker Personality Classification Using Systems Based on Acoustic-Lexical Cues and an Optimal Tree-Structured Bayesian Network

Kartik Audhkhasi, Angeliki Metallinou, Ming Li, Shrikanth S. Narayanan

Signal Analysis and Interpretation Lab (SAIL), Electrical Engineering Department, University of Southern California, Los Angeles, CA, USA

Automatic classification of human personality along the Big Five dimensions is an interesting problem with several practical applications. This paper makes several contributions to this problem. First, we propose a few automatically derived personality-discriminating lexical features which provide information complementary to the conventional acoustic-prosodic cues. We also design a frame-level Gaussian mixture model based system which adds complementary information to the systems trained on global statistical functionals. Next, we note that the Big Five dimensions are correlated and thus model the dependency between these dimensions in the form of an optimal tree-structured Bayesian network. Our final sub-system consists of within-class covariance normalization followed by L1-regularized logistic regression. Fusion of all these sub-systems achieves better classification performance than independently trained classifiers using just acoustic features.
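As an illustration of the tree-structured dependency model mentioned in the abstract, the sketch below learns a tree over five binary trait labels with the classical Chow-Liu procedure: compute pairwise empirical mutual information, then keep a maximum-weight spanning tree. This is an assumption about how such an "optimal" tree could be obtained, not a description of the authors' implementation; the abstract does not name the algorithm. The function names, the synthetic labels, and the flip probability are all hypothetical and chosen only to make the example runnable.

```python
import itertools
import math
import random

# The Big Five traits; the labels generated below are synthetic, not the paper's data.
TRAITS = ["Openness", "Conscientiousness", "Extraversion",
          "Agreeableness", "Neuroticism"]


def mutual_information(x, y):
    """Empirical mutual information (in nats) between two binary label lists."""
    n = len(x)
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            p_ab = sum(1 for u, v in zip(x, y) if u == a and v == b) / n
            p_a = x.count(a) / n
            p_b = y.count(b) / n
            if p_ab > 0:  # p_ab > 0 implies p_a > 0 and p_b > 0
                mi += p_ab * math.log(p_ab / (p_a * p_b))
    return mi


def chow_liu_tree(columns):
    """Return the edges of a maximum-weight spanning tree (Kruskal's method),
    where the weight of edge (i, j) is the mutual information of columns i and j."""
    n_vars = len(columns)
    weighted = sorted(
        ((mutual_information(columns[i], columns[j]), i, j)
         for i, j in itertools.combinations(range(n_vars), 2)),
        reverse=True,
    )
    parent = list(range(n_vars))  # union-find forest used to reject cycles

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    edges = []
    for _, i, j in weighted:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            edges.append((i, j))
    return edges


def noisy_copy(values, flip_prob):
    """Copy a binary list, flipping each entry with probability flip_prob."""
    return [1 - v if random.random() < flip_prob else v for v in values]


# Hypothetical data: each trait label is a noisy copy of the previous one,
# so the labels are correlated along a chain.
random.seed(0)
n_samples = 2000
columns = [[random.randint(0, 1) for _ in range(n_samples)]]
for _ in range(len(TRAITS) - 1):
    columns.append(noisy_copy(columns[-1], 0.1))

edges = chow_liu_tree(columns)
for i, j in edges:
    print(TRAITS[i], "--", TRAITS[j])
```

The returned edges form an undirected skeleton. Picking any trait as the root and directing edges away from it would give a tree-structured Bayesian network whose conditional probability tables can then be estimated from label counts.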

Index Terms: Speaker Personality Classification, Bayesian Network Structure Learning, Gaussian Mixture Models, Within Class Covariance Normalization


Bibliographic reference. Audhkhasi, Kartik / Metallinou, Angeliki / Li, Ming / Narayanan, Shrikanth S. (2012): "Speaker personality classification using systems based on acoustic-lexical cues and an optimal tree-structured Bayesian network", In INTERSPEECH-2012, 262-265.