COST278 and ISCA Tutorial and Research Workshop (ITRW) on Robustness Issues in Conversational Interaction

University of East Anglia, Norwich, UK
August 30-31, 2004

Multimodal Speaker Identification using Adaptive Decision Fusion with Reliability Weighted Summation

Engin Erzin, Yücel Yemez, A. Murat Tekalp

Multimedia, Vision and Graphics Laboratory, College of Engineering, Koc University, Sariyer, Istanbul, Turkey

We present a multimodal open-set speaker identification system that integrates information from the audio, face and lip motion modalities. To fuse the modalities, we employ the so-called product rule with a novel adaptive reliability-based weighting structure. The proposed adaptive product rule is more robust in the presence of unreliable modalities, provided that the employed reliability measure is effective in assessing classifier decisions. The proposed reliability measure, which is well suited to the open-set speaker identification problem, is used to make more robust accept and reject decisions. Experimental results that support this assertion are provided.
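
To make the fusion idea concrete, the following is a minimal sketch of a reliability-weighted product rule, computed in the log domain, where the weighted product becomes a weighted sum. It is an illustration under stated assumptions and not the system described here: the function names, the normalization of the weights to sum to one, the example scores and the fixed accept threshold are all hypothetical, and the reliability values are taken as given inputs because the abstract does not define how they are computed.

```python
def fuse_log_likelihoods(log_likelihoods, reliabilities):
    """Reliability-weighted sum of per-modality log-likelihood scores.

    log_likelihoods: dict mapping a modality name to a list of scores,
                     one per enrolled speaker (hypothetical layout).
    reliabilities:   dict mapping a modality name to a non-negative weight
                     (assumed given; how it is estimated is not shown).
    """
    total = sum(reliabilities.values())
    if total <= 0:
        raise ValueError("at least one modality needs a positive reliability")

    n_speakers = len(next(iter(log_likelihoods.values())))
    fused = [0.0] * n_speakers
    for modality, scores in log_likelihoods.items():
        # Assumption: weights are normalized so that they sum to one.
        weight = reliabilities[modality] / total
        for i, score in enumerate(scores):
            fused[i] += weight * score
    return fused


def identify(fused_scores, threshold):
    """Open-set decision: return the best speaker index, or None to reject."""
    best = max(range(len(fused_scores)), key=fused_scores.__getitem__)
    return best if fused_scores[best] >= threshold else None


# Hypothetical scores for two enrolled speakers and two modalities.
scores = {"audio": [-1.0, -5.0], "face": [-4.0, -2.0]}
weights = {"audio": 3.0, "face": 1.0}  # audio judged more reliable here

fused = fuse_log_likelihoods(scores, weights)
decision = identify(fused, threshold=-2.0)
```

In this sketch a modality with low reliability contributes little to the fused score, so an unreliable classifier is less able to overturn the decision of a reliable one; the threshold then separates accepted speakers from rejected, unknown ones.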

