8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Fusing Acoustic, Phonetic and Data-Driven Systems for Text-Independent Speaker Verification

Asmaa El Hannani (1), Dijana Petrovska-Delacrétaz (2)

(1) University of Fribourg, Switzerland
(2) INT, France

This paper describes our recent efforts in exploring data-driven high-level features and their combination with low-level spectral features for speaker verification. In particular, we compare the phonetic and data-driven approaches and study their complementarity with the short-term acoustic approach. Our objective is to show that data-driven units, automatically acquired from the speech data, can be used like phonemes to extract high-level features, and that they bring complementary speaker-specific information that yields improvements when fused with acoustic systems. Results obtained on the NIST 2006 Speaker Recognition Evaluation data show that combining the phonetic, data-driven and Gaussian Mixture Model (GMM) systems brings a 27% relative reduction in equal error rate (EER) compared with the baseline GMM system.
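
As an illustration of the quantities mentioned above, the following is a minimal sketch of score-level fusion and of how an EER and its relative reduction can be computed. It is not the authors' implementation: the abstract does not state the fusion rule, so the weighted sum used here is an assumption, and the function names (fuse, equal_error_rate, relative_reduction) and the toy scores are hypothetical. A 27% relative reduction means the fused EER is 0.73 times the baseline EER.

```python
def fuse(score_lists, weights):
    """Weighted-sum fusion of per-trial scores from several systems.

    Assumption: a simple linear combination; the paper's actual fusion
    method is not given in the abstract.
    """
    return [sum(w * s for w, s in zip(weights, trial))
            for trial in zip(*score_lists)]


def equal_error_rate(target_scores, impostor_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold. The function
    returns the mean of the false acceptance and false rejection rates at
    the threshold where the two rates are closest.
    """
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    best_gap, eer = None, None
    for t in thresholds:
        frr = sum(s < t for s in target_scores) / len(target_scores)
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


def relative_reduction(baseline_eer, fused_eer):
    """Relative EER reduction of a fused system with respect to a baseline."""
    return (baseline_eer - fused_eer) / baseline_eer


if __name__ == "__main__":
    # Toy scores for illustration only; these are not data from the paper.
    gmm_targets, gmm_impostors = [2.0, 3.0, 4.0, 5.0], [0.0, 1.0, 2.5, 3.5]
    hl_targets, hl_impostors = [3.0, 2.0, 5.0, 4.0], [1.0, 0.0, 1.5, 2.5]

    baseline = equal_error_rate(gmm_targets, gmm_impostors)
    fused = equal_error_rate(
        fuse([gmm_targets, hl_targets], [0.5, 0.5]),
        fuse([gmm_impostors, hl_impostors], [0.5, 0.5]),
    )
    print("baseline EER:", baseline)
    print("fused EER:", fused)
    if baseline > 0:
        print("relative reduction:", relative_reduction(baseline, fused))
```

In practice the fusion weights would be tuned on held-out development data rather than fixed by hand, and the EER would be computed over a much larger trial list than this toy example.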

Bibliographic reference.  Hannani, Asmaa El / Petrovska-Delacrétaz, Dijana (2007): "Fusing acoustic, phonetic and data-driven systems for text-independent speaker verification", In INTERSPEECH-2007, 1230-1233.