15th Annual Conference of the International Speech Communication Association

September 14-18, 2014

Ensemble of Machine Learning Algorithms for Cognitive and Physical Speaker Load Detection

How Jing (1), Ting-Yao Hu (2), Hung-Shin Lee (1), Wei-Chen Chen (3), Chi-Chun Lee (3), Yu Tsao (1), Hsin-Min Wang (1)

(1) Academia Sinica, Taiwan
(2) National Taiwan University, Taiwan
(3) National Tsing Hua University, Taiwan

We present our methods and results for the Interspeech 2014 Computational Paralinguistics ChallengE (ComParE), whose goal is to detect the cognitive or physical load of a speaker from acoustic features. Seven classification models contribute to our final prediction: a neural network with rectified linear units and dropout (ReLUNet), a conditional restricted Boltzmann machine (CRBM), logistic regression (LR), a support vector machine (SVM), Gaussian discriminant analysis (GDA), k-nearest neighbors (KNN), and a random forest (RF). Linearly blending the predictions of these models yields significant improvements over the challenge baseline.
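
As an illustration of what linear blending of model predictions can look like, the following is a minimal sketch only. The abstract does not say how the blending weights were obtained; fitting them by least squares on held-out predictions is an assumption made here, and the function names `fit_blend_weights` and `blend`, the 0.5 threshold, and the toy data are all hypothetical.

```python
import numpy as np


def fit_blend_weights(val_preds, val_labels):
    """Fit one weight per model by ordinary least squares (an assumed choice).

    val_preds:  (n_samples, n_models) per-model scores for the positive class
    val_labels: (n_samples,) binary labels, 0 or 1
    """
    X = np.asarray(val_preds, dtype=float)
    y = np.asarray(val_labels, dtype=float)
    # Solve min_w ||X w - y||^2; lstsq also copes with rank-deficient X.
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return weights


def blend(preds, weights, threshold=0.5):
    """Combine per-model scores linearly and threshold the blended score."""
    scores = np.asarray(preds, dtype=float) @ weights
    return (scores >= threshold).astype(int)


# Toy held-out predictions from two hypothetical models:
# the first is informative, the second always outputs 0.5.
val_preds = [[0.0, 0.5], [1.0, 0.5], [0.0, 0.5], [1.0, 0.5]]
val_labels = [0, 1, 0, 1]

weights = fit_blend_weights(val_preds, val_labels)
decisions = blend(val_preds, weights)
```

In this toy case the fit places essentially all of the weight on the informative model. In practice the weights would be fit on data not used to train the individual models, so that the blend does not simply reward overfitted components.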

Bibliographic reference. Jing, How / Hu, Ting-Yao / Lee, Hung-Shin / Chen, Wei-Chen / Lee, Chi-Chun / Tsao, Yu / Wang, Hsin-Min (2014): "Ensemble of machine learning algorithms for cognitive and physical speaker load detection", In INTERSPEECH-2014, 447-451.