12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Conditioned Hidden Markov Model Fusion for Multimodal Classification

Michael Glodek, Stefan Scherer, Friedhelm Schwenker

Universität Ulm, Germany

Classification with hidden Markov models (HMMs) is generally done by comparing the model likelihoods and choosing the class most likely to have generated the data. This work investigates a conditioned HMM, which additionally provides a probability for each class label, and compares different fusion strategies. The motivation is twofold: on the one hand, applications in affective computing can pass the uncertainty of their classification on to the next processing unit; on the other hand, different streams can be fused to increase performance. The data set studied comprises two modalities and is based on a naturalistic multiparty dialogue. The goal is to discriminate between laughter and utterances. The conditioned HMM outperforms the classical HMM using different late fusion approaches, while additionally providing a measure of certainty about the class decision.
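As a rough illustration of the classical baseline described in the abstract (not the conditioned HMM itself, whose formulation is given only in the full paper), the sketch below scores a sequence under one HMM per class, fuses two modalities by summing their log-likelihoods (a product-rule late fusion), and picks the class with the highest score. Everything here is an assumption made for illustration: the discrete observation symbols, the toy parameters, the equal class priors used to normalise the scores, and the names `log_likelihood` and `classify`.

```python
import numpy as np


def log_likelihood(obs, pi, A, B):
    """Forward algorithm in log space for a discrete-observation HMM.

    obs: sequence of symbol indices
    pi:  (S,) initial state probabilities
    A:   (S, S) transition probabilities, A[i, j] = P(state j | state i)
    B:   (S, V) emission probabilities, B[i, v] = P(symbol v | state i)
    """
    log_pi, log_A, log_B = np.log(pi), np.log(A), np.log(B)
    alpha = log_pi + log_B[:, obs[0]]
    for o in obs[1:]:
        # alpha_new[j] = logsumexp_i(alpha[i] + log_A[i, j]) + log_B[j, o]
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + log_B[:, o]
    return np.logaddexp.reduce(alpha)


def classify(streams, models):
    """Late fusion by summing per-modality log-likelihoods for each class.

    streams: list of observation sequences, one per modality
    models:  dict mapping class name -> list of (pi, A, B), one per modality
    Returns the winning class and scores normalised under equal class priors.
    """
    classes = list(models)
    scores = np.array([
        sum(log_likelihood(obs, *params)
            for obs, params in zip(streams, models[c]))
        for c in classes
    ])
    normalised = np.exp(scores - np.logaddexp.reduce(scores))
    return classes[int(np.argmax(scores))], dict(zip(classes, normalised))


# Toy parameters (hypothetical): 2 hidden states, 2 observation symbols.
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])
B_laughter = np.array([[0.2, 0.8], [0.1, 0.9]])   # mostly emits symbol 1
B_utterance = np.array([[0.8, 0.2], [0.9, 0.1]])  # mostly emits symbol 0

models = {
    "laughter": [(pi, A, B_laughter), (pi, A, B_laughter)],
    "utterance": [(pi, A, B_utterance), (pi, A, B_utterance)],
}

audio = [1, 1, 0, 1]  # hypothetical quantised audio features
video = [1, 1, 1]     # hypothetical quantised video features
label, posterior = classify([audio, video], models)
```

The normalised scores only approximate a class posterior under the equal-prior assumption; the abstract's point is that the conditioned HMM provides such a class probability directly rather than as a post-hoc normalisation.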


Bibliographic reference. Glodek, Michael / Scherer, Stefan / Schwenker, Friedhelm (2011): "Conditioned hidden Markov model fusion for multimodal classification", in INTERSPEECH-2011, 2269-2272.