ISCA Archive ICSLP 2000

Transformation enhanced multi-grained modeling for text-independent speaker recognition

Upendra V. Chaudhari, Jiri Navrátil, Stéphane H. Maes, Ramesh Gopinath

We describe our formulation of transformation-enhanced data modeling, which we use to develop a multi-grained data analysis approach to text-independent speaker recognition. The broad goal is to address difficulties caused by sparse training and test data. First, we detail our development of maximum-likelihood, transformation-based recognition with diagonally constrained Gaussian mixture models, and give results that show its robustness to decreasing amounts of training data. Then, using these models as building blocks, we develop a multi-grained model structure. For this, the training data must be labeled, e.g. with an HMM-based phone labeler. A graduated phone class structure is then used to train the speaker model at various levels of detail. This structure is a tree whose root node contains all the phones. Subsequent levels partition the phones into increasingly fine-grained linguistic classes. We demonstrate the effectiveness of the modeling with identification and verification experiments.
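The graduated phone class structure can be pictured with a small sketch. The Python below is only an illustration under stated assumptions, not the authors' implementation: the phone symbols, the class names (`vowels`, `consonants`, `stops`, `nasals`), and the helpers `PhoneClassNode`, `is_partition`, `path_for_phone`, and `group_frames` are all hypothetical. It shows a tree whose root holds every phone, whose children partition their parent's phones, and how phone-labeled frames might be grouped per node so that a model could be trained at each level of detail.

```python
from dataclasses import dataclass, field


@dataclass
class PhoneClassNode:
    """One node of a (hypothetical) phone class tree."""
    name: str
    phones: frozenset
    children: list = field(default_factory=list)

    def add_child(self, name, phones):
        child = PhoneClassNode(name, frozenset(phones))
        self.children.append(child)
        return child


def is_partition(node):
    """True if, at every non-leaf node, the children are disjoint and cover the node's phones."""
    if not node.children:
        return True
    seen = set()
    for child in node.children:
        if seen & child.phones:  # overlap between sibling classes
            return False
        seen |= child.phones
    return seen == set(node.phones) and all(is_partition(c) for c in node.children)


def path_for_phone(node, phone):
    """Node names from the root down to the finest class containing `phone`."""
    if phone not in node.phones:
        return []
    for child in node.children:
        if phone in child.phones:
            return [node.name] + path_for_phone(child, phone)
    return [node.name]


def group_frames(root, labeled_frames):
    """Map each node name to the frames whose phone label belongs to that node's class."""
    groups = {}
    stack = [root]
    while stack:
        node = stack.pop()
        groups[node.name] = [f for f, p in labeled_frames if p in node.phones]
        stack.extend(node.children)
    return groups


def build_example_tree():
    # Illustrative phone inventory only; a real system would use the labeler's phone set.
    vowels = {"aa", "iy", "uw"}
    stops = {"p", "t", "k"}
    nasals = {"m", "n"}
    root = PhoneClassNode("all", frozenset(vowels | stops | nasals))
    root.add_child("vowels", vowels)
    consonants = root.add_child("consonants", stops | nasals)
    consonants.add_child("stops", stops)
    consonants.add_child("nasals", nasals)
    return root
```

In this sketch a frame labeled with a nasal contributes data to the `nasals`, `consonants`, and `all` nodes, so coarser nodes always see at least as much data as finer ones, which is one plausible reading of why such a structure helps when data are sparse.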


Cite as: Chaudhari, U.V., Navrátil, J., Maes, S.H., Gopinath, R. (2000) Transformation enhanced multi-grained modeling for text-independent speaker recognition. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 2, 298-301

@inproceedings{chaudhari00_icslp,
  author={Upendra V. Chaudhari and Jiri Navrátil and Stéphane H. Maes and Ramesh Gopinath},
  title={{Transformation enhanced multi-grained modeling for text-independent speaker recognition}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  pages={vol. 2, 298-301}
}