EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

Speaker Model Selection Using Bayesian Information Criterion for Speaker Indexing and Speaker Adaptation

Masafumi Nishida (1), Tatsuya Kawahara (2)

(1) Japan Science and Technology Corporation, Japan
(2) Kyoto University, Japan

This paper addresses unsupervised speaker indexing for discussion audio archives. We propose a flexible framework that selects the optimal speaker model, either a Gaussian mixture model (GMM) or a vector quantization (VQ) model, for the input utterances based on the Bayesian Information Criterion (BIC). The framework makes it possible to use a discrete model (VQ) when the data is sparse, and to switch seamlessly to a continuous model (GMM) once a large cluster has been obtained. Speaker indexing is also applied to automatic speech recognition of discussions, where it is evaluated by adapting a speaker-independent acoustic model to each participant. We demonstrate that indexing with our method is sufficiently accurate for speaker adaptation.
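
The selection step described above can be illustrated with a minimal sketch of generic BIC-based model selection. This is not the formulation used in the paper, which the abstract does not give. It assumes the common "higher is better" form of BIC (log-likelihood minus half the parameter count times the log of the number of frames), and it assumes that a comparable log-likelihood can be computed for the VQ model. The function names, the `penalty_weight` parameter, and all numbers are hypothetical and chosen only for illustration.

```python
import math


def bic_score(log_likelihood, n_params, n_frames, penalty_weight=1.0):
    """Generic BIC, 'higher is better' form: data fit minus complexity penalty."""
    return log_likelihood - 0.5 * penalty_weight * n_params * math.log(n_frames)


def select_model(candidates, n_frames, penalty_weight=1.0):
    """Return the name of the candidate with the highest BIC.

    candidates maps a model name to (log_likelihood, n_params).
    """
    return max(
        candidates,
        key=lambda name: bic_score(*candidates[name], n_frames, penalty_weight),
    )


# Hypothetical parameter counts for 12-dimensional features and 16 components:
# VQ stores only centroids; a diagonal-covariance GMM adds variances and weights.
VQ_PARAMS = 16 * 12
GMM_PARAMS = 16 * (12 + 12) + (16 - 1)

# Illustrative (made-up) log-likelihoods for a small and a large cluster.
sparse = {"VQ": (-5200.0, VQ_PARAMS), "GMM": (-5000.0, GMM_PARAMS)}
large = {"VQ": (-520000.0, VQ_PARAMS), "GMM": (-500000.0, GMM_PARAMS)}

choice_sparse = select_model(sparse, n_frames=200)
choice_large = select_model(large, n_frames=20000)
print(choice_sparse, choice_large)
```

With these illustrative numbers, the penalty term dominates when few frames are available, so the smaller VQ model is chosen. Once the cluster is large, the GMM's better fit outweighs its extra parameters, which mirrors the switch from a discrete to a continuous model described in the abstract.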

Bibliographic reference.  Nishida, Masafumi / Kawahara, Tatsuya (2003): "Speaker model selection using Bayesian information criterion for speaker indexing and speaker adaptation", In EUROSPEECH-2003, 1849-1852.