The problem of how to estimate variance parameters in client models from scarce data is addressed in the context of text-dependent, HMM-based, automatic speaker verification. Variance flooring and variance scaling are investigated as two alternative estimation techniques, each used with or without variance tying on the state level to reduce the number of parameters to estimate. The best results are achieved with no tying and a variance flooring method in which the floor for a variance vector in a client model is proportional to the corresponding variance vector in a gender-dependent, multi-speaker, non-client model. Further, variance tying reduces storage requirements considerably without much loss in recognition accuracy. The results also confirm a finding from a previous study: re-using non-client variances performs comparably to variance flooring and is much simpler. Comparisons are made on three large telephone-quality speech corpora.
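The proportional variance floor described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `floor_variances`, the proportionality constant `gamma`, and the example numbers are all assumptions made here, and the abstract does not state what constant was used.

```python
def floor_variances(client_var, nonclient_var, gamma=1.0):
    """Apply a per-dimension variance floor to one client variance vector.

    The floor is gamma * nonclient_var, i.e. proportional to the matching
    non-client variance vector. gamma is a hypothetical tuning constant.
    """
    if len(client_var) != len(nonclient_var):
        raise ValueError("variance vectors must have the same length")
    if gamma < 0:
        raise ValueError("gamma must be non-negative")
    # Keep the client estimate unless it falls below the floor.
    return [max(c, gamma * n) for c, n in zip(client_var, nonclient_var)]


# Illustrative numbers only, not taken from the paper.
client = [0.02, 1.50, 0.40]
nonclient = [1.00, 1.00, 2.00]
floored = floor_variances(client, nonclient, gamma=0.5)
```

In this sketch only the dimensions whose client estimate falls below the floor are changed. Copying `nonclient_var` unchanged into the client model would correspond to the simpler re-use of non-client variances that the abstract mentions.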
Cite as: Melin, H., Lindberg, J. (1999) Variance flooring, scaling and tying for text-dependent speaker verification. Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999), 1975-1978, doi: 10.21437/Eurospeech.1999-435
@inproceedings{melin99_eurospeech,
  author={H. Melin and Johan Lindberg},
  title={{Variance flooring, scaling and tying for text-dependent speaker verification}},
  year=1999,
  booktitle={Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999)},
  pages={1975--1978},
  doi={10.21437/Eurospeech.1999-435}
}