This paper investigates approaches to modeling the time evolution of short-time spectral features in paralinguistic speech type classification, focusing on the detection of speech influenced by physical exertion. The time series model consists of autoregressive processes of multiple time scales and orders and is trained to describe the long-term dynamics of a given target speech class. The model is applied in two ways to improve long-term modeling in the detection task: 1) to perform predictive filtering of the features and 2) to automatically select instantaneous classification subspaces. The spectrum analysis method underlying the short-time features is also varied between the standard discrete Fourier transform and a time-weighted linear predictive method, which yields smooth all-pole spectral envelope models. Configurations of the proposed methods are evaluated in the Physical Load task of the Interspeech 2014 Computational Paralinguistics Challenge and show improvement over both the baseline timbral classifier and the challenge baseline. The interrelationships among the methods are also discussed.
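As a rough illustration of the predictive filtering idea, the following sketch fits a single autoregressive (AR) model to one feature trajectory by least squares and replaces each frame with its one-step AR prediction. It is not the authors' implementation: the abstract describes processes of multiple time scales and orders trained on a target speech class, whereas this example assumes a single scale, a single order, and one feature dimension. The function names `fit_ar` and `predict_ar` are hypothetical.

```python
import numpy as np


def fit_ar(x, order):
    """Least-squares AR fit (assumed form): x[t] ~ sum_k a[k] * x[t-1-k]."""
    x = np.asarray(x, dtype=float)
    # The row for time t holds the lagged values [x[t-1], ..., x[t-order]].
    lagged = np.column_stack(
        [x[order - 1 - k : len(x) - 1 - k] for k in range(order)]
    )
    target = x[order:]
    coeffs, *_ = np.linalg.lstsq(lagged, target, rcond=None)
    return coeffs


def predict_ar(x, coeffs):
    """Replace each frame with its one-step prediction from past frames."""
    x = np.asarray(x, dtype=float)
    order = len(coeffs)
    pred = x.copy()  # the first `order` frames have no full history; keep them
    for t in range(order, len(x)):
        # Reverse the window so it reads [x[t-1], ..., x[t-order]].
        pred[t] = coeffs @ x[t - order : t][::-1]
    return pred
```

In a detection setting, one would presumably fit the coefficients on feature trajectories of the target class and then apply `predict_ar` to each feature dimension of a test utterance, so that trajectories consistent with the target dynamics pass through with little change.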
Bibliographic reference. Pohjalainen, Jouni / Alku, Paavo (2014): "Filtering and subspace selection for spectral features in detecting speech under physical stress", In INTERSPEECH-2014, 432-436.