15th Annual Conference of the International Speech Communication Association

September 14-18, 2014

On the Use of Bhattacharyya Based GMM Distance and Neural Net Features for Identification of Cognitive Load Levels

Tin Lay Nwe, Trung Hieu Nguyen, Bin Ma

A*STAR, Singapore

This paper presents a method for detecting cognitive load levels from speech. When speech is modulated by different levels of cognitive load, its acoustic characteristics change. We measure the acoustic distance of a stressed utterance from baseline stress-free speech using a GMM-SVM kernel with a Bhattacharyya-based GMM distance. In addition, the airflow structure of speech production is believed to be nonlinear, which motivates us to investigate better techniques for capturing the nonlinear characteristics of stress information in acoustic features. Inspired by the recent success of neural networks for representation learning, we employ a single-hidden-layer feed-forward network with non-linear activation to extract the feature vectors. Furthermore, people react differently to a particular task load, and this inter-speaker difference in stress responses presents a major challenge for stress level detection. We use a bootstrapped training process to learn the stress response of a particular speaker. We perform experiments using the Cognitive Load with Speech and EGG (CLSE) data sets provided for the Cognitive Load Sub-Challenge of the INTERSPEECH 2014 Computational Paralinguistics Challenge. The results show that the system with our proposed strategies performs well on the validation and test sets.
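
For readers unfamiliar with the distance named above, the following is a minimal sketch of the standard Bhattacharyya distance between two Gaussians with diagonal covariances. It is not the kernel used in the paper, whose exact form the abstract does not give. The function names `bhattacharyya_diag` and `componentwise_distance` are hypothetical, and the weighted sum over paired mixture components is only one plausible way to extend the distance to GMMs, assuming both models were adapted from a shared background model so that components correspond one to one.

```python
import numpy as np


def bhattacharyya_diag(mu1, var1, mu2, var2):
    """Bhattacharyya distance between two diagonal-covariance Gaussians.

    D = 1/8 (mu1-mu2)^T S^-1 (mu1-mu2) + 1/2 ln(|S| / sqrt(|S1||S2|)),
    where S = (S1 + S2) / 2 and all covariances are diagonal.
    """
    mu1, var1, mu2, var2 = (
        np.asarray(a, dtype=float) for a in (mu1, var1, mu2, var2)
    )
    var_avg = 0.5 * (var1 + var2)
    # Term driven by the difference between the means.
    mean_term = 0.125 * np.sum((mu1 - mu2) ** 2 / var_avg)
    # Term driven by the mismatch between the covariances.
    cov_term = 0.5 * (
        np.sum(np.log(var_avg))
        - 0.5 * (np.sum(np.log(var1)) + np.sum(np.log(var2)))
    )
    return float(mean_term + cov_term)


def componentwise_distance(weights, means_a, vars_a, means_b, vars_b):
    """Assumption: weighted sum of per-component distances between two
    GMMs whose components are paired by index (illustrative only)."""
    return float(
        sum(
            w * bhattacharyya_diag(ma, va, mb, vb)
            for w, ma, va, mb, vb in zip(
                weights, means_a, vars_a, means_b, vars_b
            )
        )
    )
```

The distance is zero for identical Gaussians and symmetric in its arguments, so a larger value would indicate an utterance model that lies further from the stress-free baseline model.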


Bibliographic reference. Nwe, Tin Lay / Nguyen, Trung Hieu / Ma, Bin (2014): "On the use of Bhattacharyya based GMM distance and neural net features for identification of cognitive load levels", in INTERSPEECH-2014, 736-740.