INTERSPEECH 2007
8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

High-Level Feature-Based Speaker Verification via Articulatory Phonetic-Class Pronunciation Modeling

Shi-Xiong Zhang (1), Man-Wai Mak (1), Helen Meng (2)

(1) Hong Kong Polytechnic University, China
(2) Chinese University of Hong Kong, China

Although articulatory feature-based conditional pronunciation models (AFCPMs) can capture the pronunciation characteristics of speakers, they require one discrete density function for each phoneme, which may lead to inaccurate models when the amount of training data is limited. This paper proposes a phonetic-class based AFCPM in which the density functions in speaker models are conditioned on phonetic classes instead of phonemes. Phonemes are mapped to phonetic classes by (1) vector quantizing the phoneme-dependent universal background models, (2) grouping phonemes according to the classical phoneme tree, or (3) combining (1) and (2). A new scoring method that uses an SVM to combine the scores of phonetic-class models is also proposed. Evaluations based on the 2000 NIST SRE show that the proposed approach can effectively solve the data sparseness problem encountered in conventional AFCPM.
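As a rough illustration of mapping option (1), the sketch below clusters phonemes whose discrete density functions are similar, so that each cluster can share one density in the speaker model. It is not the authors' implementation: the abstract does not specify the quantizer, so plain k-means with squared Euclidean distance is assumed here, and the function name `quantize_phonemes` and the toy densities are hypothetical.

```python
from typing import Dict, List


def sq_dist(a: List[float], b: List[float]) -> float:
    """Squared Euclidean distance between two density vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors: List[List[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def quantize_phonemes(
    densities: Dict[str, List[float]], num_classes: int, iters: int = 20
) -> Dict[str, int]:
    """Assign each phoneme to a phonetic class with k-means (an assumption).

    `densities` maps a phoneme label to its flattened discrete density,
    standing in for a phoneme-dependent background model.
    """
    phonemes = sorted(densities)
    # Deterministic farthest-point initialisation of the codebook.
    centroids = [densities[phonemes[0]]]
    while len(centroids) < num_classes:
        far = max(
            phonemes,
            key=lambda p: min(sq_dist(densities[p], c) for c in centroids),
        )
        centroids.append(densities[far])

    assign: Dict[str, int] = {}
    for _ in range(iters):
        # Assignment step: nearest centroid for every phoneme.
        new_assign = {
            p: min(
                range(num_classes),
                key=lambda k: sq_dist(densities[p], centroids[k]),
            )
            for p in phonemes
        }
        if new_assign == assign:
            break  # converged
        assign = new_assign
        # Update step: recompute each non-empty class centroid.
        for k in range(num_classes):
            members = [densities[p] for p in phonemes if assign[p] == k]
            if members:
                centroids[k] = mean(members)
    return assign


# Toy densities (invented for illustration only).
toy = {
    "aa": [0.7, 0.2, 0.1],
    "ae": [0.6, 0.3, 0.1],
    "s": [0.1, 0.1, 0.8],
    "z": [0.1, 0.2, 0.7],
}
classes = quantize_phonemes(toy, num_classes=2)
```

With the toy input, the two vowels end up in one class and the two fricatives in the other. Pooling training data over each class is what would reduce the number of densities that must be estimated per speaker.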

Bibliographic reference. Zhang, Shi-Xiong / Mak, Man-Wai / Meng, Helen (2007): "High-level feature-based speaker verification via articulatory phonetic-class pronunciation modeling", in INTERSPEECH-2007, 762-765.