COST278 and ISCA Tutorial and Research Workshop (ITRW) on Robustness Issues in Conversational Interaction
University of East Anglia, Norwich, UK
We present a multimodal open-set speaker identification system that integrates information from the audio, face, and lip motion modalities. To fuse the modalities, we employ the so-called product rule with a novel adaptive, reliability-based weighting structure. The proposed adaptive product rule is more robust in the presence of unreliable modalities, provided that the employed reliability measure is effective in assessing classifier decisions. The proposed reliability measure, which is well suited to the open-set speaker identification problem, is used to reach more robust accept and reject decisions. Experimental results that support this assertion are provided.
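As a rough illustration of the fusion idea, the sketch below applies a reliability-weighted product rule in the log domain, where the weighted product of per-modality likelihoods becomes a weighted sum of log-likelihood scores. It is not the authors' implementation: the function name `fuse_open_set`, the score values, the reliability values, and the rejection threshold are all hypothetical, and the abstract does not specify how the reliability measure itself is computed, so reliabilities are simply taken as given inputs here.

```python
import numpy as np


def fuse_open_set(log_scores, reliabilities, threshold):
    """Reliability-weighted fusion with an open-set accept/reject step.

    log_scores:    dict mapping modality name -> per-speaker log-likelihood
                   scores (all arrays must have the same length).
    reliabilities: dict mapping modality name -> nonnegative reliability
                   (assumed to be supplied by some external measure).
    threshold:     hypothetical rejection threshold on the fused score.

    Returns (speaker_index or None, fused_scores).
    """
    modalities = sorted(log_scores)
    weights = np.array([reliabilities[m] for m in modalities], dtype=float)
    if np.any(weights < 0) or weights.sum() == 0:
        raise ValueError("reliabilities must be nonnegative and not all zero")
    weights = weights / weights.sum()  # normalize so the weights sum to 1

    # Stack scores into shape (n_modalities, n_speakers).
    scores = np.stack([np.asarray(log_scores[m], dtype=float) for m in modalities])

    # Weighted product rule in the log domain: a weighted sum of log scores.
    fused = weights @ scores

    best = int(np.argmax(fused))
    # Open-set decision: reject if even the best candidate scores too low.
    if fused[best] >= threshold:
        return best, fused
    return None, fused


# Illustrative scores for three enrolled speakers (made-up numbers).
scores = {
    "audio": [2.0, -1.0, 0.0],
    "face": [-3.0, 1.0, 0.0],
    "lip": [0.5, 0.0, -0.5],
}

# Audio judged most reliable: speaker 0 is accepted.
speaker, fused = fuse_open_set(scores, {"audio": 0.6, "face": 0.1, "lip": 0.3}, threshold=0.5)

# Same scores, stricter threshold: the claim is rejected.
rejected, _ = fuse_open_set(scores, {"audio": 0.6, "face": 0.1, "lip": 0.3}, threshold=2.0)

# Face judged most reliable instead: the decision shifts to speaker 1.
speaker_face, fused_face = fuse_open_set(scores, {"audio": 0.05, "face": 0.9, "lip": 0.05}, threshold=0.5)
```

With equal weights this reduces to the plain product rule. Lowering the weight of a modality shrinks its influence on the fused score, which is the sense in which an effective reliability measure can protect the decision against an unreliable modality.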
Bibliographic reference. Erzin, Engin / Yemez, Yücel / Tekalp, A. Murat (2004): "Multimodal speaker identification using adaptive decision fusion with reliability weighted summation", In Robust2004, paper 16.