2001: A Speaker Odyssey - The Speaker Recognition Workshop

June 18-22, 2001
Crete, Greece

Speaker verification based on broad phonetic categories

Sachin S. Kajarekar (1), Hynek Hermansky (1,2)

(1) Oregon Graduate Institute of Science and Technology, Oregon, USA
(2) International Computer Science Institute, Berkeley, CA, USA

In this work we present a speaker verification system based on four broad phonetic categories: vowels+diphthongs, fricatives, glides+nasals, and silence+stops. When these categories are used separately, vowels, diphthongs, and fricatives are observed to be the most important categories for speaker verification. This observation confirms the results from the analysis of speaker and channel variability in speech. Using NIST speaker verification evaluation data, the performance of the phone-based system is compared with that of the conventional speaker verification system based on a Gaussian mixture model (GMM). The results show that the phone-based system outperforms the conventional system specifically when there is channel mismatch between training and testing data.
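
As a rough illustration of the four-way split described above, the following minimal sketch groups phone-labelled feature frames by broad phonetic category, so that each category could then be modelled and scored separately. The phone inventory (ARPAbet-style symbols), the placement of liquids with the glides, and the names `BROAD_CATEGORIES` and `group_frames_by_category` are assumptions made for illustration only; the abstract does not specify the label set or how the frames are modelled.

```python
from collections import defaultdict

# Hypothetical phone inventory (ARPAbet-style symbols). The label set actually
# used in the paper is not given in the abstract; this mapping is an assumption.
BROAD_CATEGORIES = {
    "vowels+diphthongs": {
        "aa", "ae", "ah", "ao", "eh", "er", "ih", "iy", "uh", "uw",
        "aw", "ay", "ey", "ow", "oy",
    },
    "fricatives": {"f", "v", "th", "dh", "s", "z", "sh", "zh", "hh"},
    # Liquids (l, r) are placed with the glides here; this is also an assumption.
    "glides+nasals": {"l", "r", "w", "y", "m", "n", "ng"},
    "silence+stops": {"sil", "p", "b", "t", "d", "k", "g"},
}

# Invert the table so each phone label points at its broad category.
PHONE_TO_CATEGORY = {
    phone: category
    for category, phones in BROAD_CATEGORIES.items()
    for phone in phones
}


def group_frames_by_category(frames, phone_labels):
    """Split feature frames into the four broad categories.

    frames:       sequence of feature vectors, one per frame
    phone_labels: sequence of phone labels, one per frame
    Frames whose label is not in the (assumed) inventory are skipped.
    """
    if len(frames) != len(phone_labels):
        raise ValueError("frames and phone_labels must have the same length")
    grouped = defaultdict(list)
    for frame, phone in zip(frames, phone_labels):
        category = PHONE_TO_CATEGORY.get(phone.lower())
        if category is not None:
            grouped[category].append(frame)
    return dict(grouped)
```

For example, calling `group_frames_by_category` on five frames labelled `sil`, `s`, `iy`, `n`, `ch` would place one frame in each of the four categories and skip the `ch` frame, since affricates are not covered by this assumed mapping.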

Bibliographic reference.  Kajarekar, Sachin S. / Hermansky, Hynek (2001): "Speaker verification based on broad phonetic categories", In ODYSSEY-2001, 201-206.