Previous work [1] investigated how a speaker's neighborhood can be used to estimate a better model for that speaker when little training data is available. This paper extends that work by investigating another way to merge the neighbors' models and by introducing a weight on each neighbor model to be merged. Experiments on a telephone speech database show that using the neighborhood-merged model to initialize the training phase improves on the UBM approach when little training data is available.
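The abstract does not spell out the merging rule, so the following is only an illustrative sketch of one plausible way to combine weighted neighbor GMMs, not the paper's method. It assumes diagonal-covariance components and treats the merged model as a mixture of the neighbor mixtures: every neighbor component is kept, and its mixture weight is scaled by the normalized weight of its neighbor. The names `merge_neighbor_gmms`, `Component`, and `GMM` are hypothetical.

```python
from typing import List, Tuple

# One diagonal-covariance Gaussian component (assumed representation):
# (mixture weight, mean vector, variance vector).
Component = Tuple[float, List[float], List[float]]
GMM = List[Component]


def merge_neighbor_gmms(neighbors: List[GMM], neighbor_weights: List[float]) -> GMM:
    """Merge neighbor GMMs into one GMM as a weighted mixture of mixtures.

    Assumption: this concatenation scheme is a stand-in for illustration;
    the abstract does not state how the neighbor models are merged.
    """
    if len(neighbors) != len(neighbor_weights):
        raise ValueError("need exactly one weight per neighbor model")
    total = sum(neighbor_weights)
    if total <= 0:
        raise ValueError("neighbor weights must sum to a positive value")

    merged: GMM = []
    for gmm, weight in zip(neighbors, neighbor_weights):
        scale = weight / total  # normalized weight of this neighbor
        for comp_weight, mean, var in gmm:
            # Copy the parameters so the merged model can be re-trained
            # without modifying the neighbor models.
            merged.append((scale * comp_weight, list(mean), list(var)))
    return merged


# Toy example: two neighbors with one-dimensional features.
neighbor_a = [(0.5, [0.0], [1.0]), (0.5, [2.0], [1.0])]
neighbor_b = [(1.0, [5.0], [2.0])]
merged = merge_neighbor_gmms([neighbor_a, neighbor_b], [3.0, 1.0])
```

If each neighbor's component weights sum to one, the merged component weights also sum to one, so the result is a valid GMM that could serve as the starting point for training on the target speaker's data.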
[1] Y. Mami and D. Charlet, "Speaker modeling from selected neighbors applied to speaker recognition," in Proc. Eurospeech, Geneva, Switzerland, 2003.
Cite as: Charlet, D. (2004) Neighborhood-adapted GMM for speaker recognition. Proc. The Speaker and Language Recognition Workshop (Odyssey 2004), 227-230
@inproceedings{charlet04_odyssey,
  author={Delphine Charlet},
  title={{Neighborhood-adapted GMM for speaker recognition}},
  year=2004,
  booktitle={Proc. The Speaker and Language Recognition Workshop (Odyssey 2004)},
  pages={227--230}
}