This paper proposes an articulatory feature-based conditional pronunciation modeling (AFCPM) technique for speaker verification. The technique models the pronunciation behavior of speakers by linking the actual phones they produce to the states of articulation during speech production. Speaker models consisting of conditional probabilities of two articulatory classes are adapted from a set of universal background models (UBMs) using maximum a posteriori (MAP) adaptation. This adaptation approach aims to prevent over-fitting of the speaker models when the amount of speaker data is insufficient for direct estimation. Experimental results show that the adaptation technique can enhance the discriminating power of speaker models by establishing a tighter coupling between the speaker models and the UBMs. Results also show that fusing the scores derived from an AFCPM-based system and a conventional spectral-based system achieves a significantly lower error rate than either system alone. This suggests that AFCPM and spectral features are complementary.
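The adaptation step described above can be illustrated with a minimal sketch. The abstract does not give the adaptation formula, so this assumes the common relevance-factor form of MAP adaptation for a discrete distribution: the speaker's empirical probabilities are interpolated with the UBM probabilities, with a weight that grows with the amount of speaker data. The function name `map_adapt`, the `relevance` parameter, and the articulatory labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def map_adapt(ubm_probs, speaker_counts, relevance=16.0):
    """Adapt a discrete UBM distribution toward a speaker's observed counts.

    ubm_probs:      dict mapping each articulatory label to its UBM probability
                    (for one phone); assumed to sum to 1.
    speaker_counts: dict mapping labels to how often the speaker produced them
                    for that phone; keys are assumed to be a subset of ubm_probs.
    relevance:      assumed relevance factor; larger values keep the adapted
                    model closer to the UBM.
    """
    n = sum(speaker_counts.values())
    # Weight on the speaker data: 0 with no data, approaching 1 as n grows.
    beta = n / (n + relevance)
    adapted = {}
    for label, p_ubm in ubm_probs.items():
        p_speaker = speaker_counts.get(label, 0) / n if n else 0.0
        adapted[label] = beta * p_speaker + (1.0 - beta) * p_ubm
    return adapted


# Hypothetical example: manner-of-articulation probabilities for one phone.
ubm = {"stop": 0.5, "fricative": 0.3, "vowel": 0.2}
counts = Counter({"stop": 2, "vowel": 2})
adapted = map_adapt(ubm, counts, relevance=4.0)
```

With little speaker data the adapted probabilities stay near the UBM, which is the over-fitting safeguard the abstract refers to; with no data at all the sketch returns the UBM distribution unchanged.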
Cite as: Leung, K., Mak, M., Siu, M., Kung, S. (2004) Adaptive Conditional Pronunciation Modeling using Articulatory Features for Speaker Verification. Proc. International Symposium on Chinese Spoken Language Processing, 61-64
@inproceedings{leung04_iscslp, author={KaYee Leung and ManWai Mak and Manhung Siu and SunYuan Kung}, title={{Adaptive Conditional Pronunciation Modeling using Articulatory Features for Speaker Verification}}, year=2004, booktitle={Proc. International Symposium on Chinese Spoken Language Processing}, pages={61--64} }