12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

iVector Fusion of Prosodic and Cepstral Features for Speaker Verification

Marcel Kockmann (1), Luciana Ferrer (2), Lukáš Burget (1), Jan Černocký (1)

(1) Brno University of Technology, Czech Republic
(2) SRI International, USA

In this paper we apply the promising iVector extraction technique, followed by PLDA modeling, to simple prosodic contour features. With this procedure we achieve results comparable to those of a system that models much more complex prosodic features using our recently proposed SMM-based iVector modeling technique. We then propose combining both prosodic iVectors through joint PLDA modeling, which leads to significant improvements over the individual systems and yields an EER of 5.4% on NIST SRE 2008 telephone data. Finally, combining these two prosodic iVector front ends with a baseline cepstral iVector system achieves up to a 21% relative reduction in the new DCF.
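
The two figures of merit quoted above, the equal error rate (EER) and a relative reduction in a cost measure, can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names `equal_error_rate` and `relative_reduction` are hypothetical, the scores are made-up toy values, and the EER is approximated by a simple threshold sweep rather than by the interpolation used in standard scoring tools.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold. The EER is taken
    at the threshold where the miss rate and the false-alarm rate are closest.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # Miss: a target (same-speaker) trial scored below the threshold.
        miss = sum(s < t for s in target_scores) / len(target_scores)
        # False alarm: a non-target trial scored at or above the threshold.
        fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - fa)
        if gap < best_gap:
            best_gap, eer = gap, (miss + fa) / 2
    return eer


def relative_reduction(baseline, improved):
    """Relative reduction of an error or cost measure with respect to a baseline."""
    return (baseline - improved) / baseline


# Toy scores, invented for illustration only (not data from the paper).
targets = [2.0, 1.5, 0.2, 1.0]
nontargets = [-1.0, 0.5, -0.3, 0.1]
toy_eer = equal_error_rate(targets, nontargets)

# Hypothetical cost values: a drop from 1.0 to 0.79 is a 21% relative reduction.
toy_reduction = relative_reduction(1.0, 0.79)
```

With these toy inputs, one target trial falls below the best threshold and one non-target trial falls at or above it, so both error rates equal one in four at that operating point.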

Bibliographic reference. Kockmann, Marcel / Ferrer, Luciana / Burget, Lukáš / Černocký, Jan (2011): "iVector fusion of prosodic and cepstral features for speaker verification", In INTERSPEECH-2011, 265-268.