Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Text-Independent Speaker Identification Using Gaussian Mixture Bigram Models

Wei-Ho Tsai (1,2), Chiwei Che (1), Wen-Whei Chang (2)

(1) Philips Research East Asia-Taipei, Taiwan
(2) Department of Communication Engineering, Chiao Tung University, Hsinchu, Taiwan

This paper introduces a novel speaker modeling technique based on the Gaussian mixture bigram model (GMBM) and evaluates it for text-independent speaker identification (speaker-ID). The GMBM is a stochastic framework that captures the context, or time dependency, of continuous observations from an information source. Because speech features are correlated between successive frames, we investigate whether speaker-ID can be improved by using GMBMs to model this spectral correlation. The proposed method was evaluated on a 100-speaker speech database. Experimental results show that GMBMs greatly reduce the speaker-ID error rate compared with the conventional technique based on Gaussian mixture models (GMMs).
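
The idea of scoring successive frames jointly can be illustrated with a minimal sketch. The abstract does not give the GMBM equations, so the formulation below is an assumption: each speaker is represented by a Gaussian mixture over stacked frame pairs (x[t-1], x[t]), and an utterance is scored by the summed conditional log-likelihood log p(x[t] | x[t-1]) = log p(x[t-1], x[t]) - log p(x[t-1]). All function names, the toy one-dimensional features, and the two hand-set speaker models are hypothetical, and no training procedure is shown.

```python
import numpy as np

def log_gauss(x, mean, cov):
    """Log density of N(mean, cov) evaluated at each row of x."""
    d = mean.shape[0]
    diff = x - mean
    sol = np.linalg.solve(cov, diff.T).T          # cov^{-1} (x - mean), row-wise
    maha = np.sum(diff * sol, axis=1)             # squared Mahalanobis distance
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def log_mixture(x, weights, means, covs):
    """Log density of a Gaussian mixture at each row of x (log-sum-exp)."""
    comp = np.stack([np.log(w) + log_gauss(x, m, c)
                     for w, m, c in zip(weights, means, covs)])
    top = comp.max(axis=0)
    return top + np.log(np.exp(comp - top).sum(axis=0))

def bigram_loglik(frames, weights, means, covs):
    """Sum over t of log p(x[t] | x[t-1]) under a mixture on frame pairs.

    Assumed formulation: the mixture is defined on the stacked vector
    (x[t-1], x[t]); the marginal of x[t-1] is the mixture of the
    components' marginals (first d coordinates).
    """
    d = frames.shape[1]
    pairs = np.hstack([frames[:-1], frames[1:]])
    joint = log_mixture(pairs, weights, means, covs)
    prev = log_mixture(frames[:-1], weights,
                       [m[:d] for m in means],
                       [c[:d, :d] for c in covs])
    return float(np.sum(joint - prev))

def identify(frames, speaker_models):
    """Return the speaker whose model gives the highest bigram score."""
    scores = {name: bigram_loglik(frames, *params)
              for name, params in speaker_models.items()}
    return max(scores, key=scores.get)

# Toy data: two hypothetical speakers with identical per-frame statistics
# (zero mean, unit variance) but opposite frame-to-frame correlation.
speaker_models = {
    "A": ([1.0], [np.zeros(2)], [np.array([[1.0, 0.9], [0.9, 1.0]])]),
    "B": ([1.0], [np.zeros(2)], [np.array([[1.0, -0.9], [-0.9, 1.0]])]),
}

# Synthetic utterance with positive correlation between successive frames.
rng = np.random.default_rng(0)
x = np.zeros(500)
for t in range(1, 500):
    x[t] = 0.9 * x[t - 1] + np.sqrt(0.19) * rng.standard_normal()
frames = x[:, None]

winner = identify(frames, speaker_models)
```

In this toy setup both speakers share the same single-frame distribution, so a model that scores frames independently cannot separate them; only the dependency between successive frames does. This illustrates the motivation stated in the abstract and is not a reproduction of the paper's experiment.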



Bibliographic reference.  Tsai, Wei-Ho / Che, Chiwei / Chang, Wen-Whei (2000): "Text-independent speaker identification using Gaussian mixture bigram models", In ICSLP-2000, vol.2, 314-317.