11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Acoustic Vector Resampling for GMMSVM-Based Speaker Verification

Man-Wai Mak, Wei Rao

Hong Kong Polytechnic University, China

Using GMM-supervectors as the input to SVM classifiers (namely, GMM-SVM) is one of the promising approaches to text-independent speaker verification. However, one unaddressed issue is the severe imbalance between the numbers of speaker-class utterances and impostor-class utterances available for training a speaker-dependent SVM. This paper proposes a resampling technique, namely utterance partitioning with acoustic vector resampling (UP-AVR), to mitigate this problem. Specifically, the sequence order of acoustic vectors in an enrollment utterance is first randomized; then the randomized sequence is partitioned into a number of segments. Each segment is used to produce a GMM-supervector via MAP adaptation and mean-vector concatenation. The desired number of speaker-class supervectors can be produced by repeating this process several times. Experimental evaluations suggest that UP-AVR can reduce the EER of GMM-SVM systems by about 10%.
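
The resampling procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a diagonal-covariance universal background model (UBM) and the common relevance-MAP update of the means only. The function names `map_adapt_means` and `up_avr`, the relevance factor of 16, and the default numbers of partitions and resamplings are all hypothetical choices made for illustration.

```python
import numpy as np


def map_adapt_means(X, weights, means, variances, relevance=16.0):
    """Relevance-MAP adaptation of UBM means (assumed diagonal covariances).

    X: (T, D) acoustic vectors; weights: (M,); means, variances: (M, D).
    Returns the adapted means, shape (M, D).
    """
    # Log-density of every frame under every Gaussian, shape (T, M).
    diff = X[:, None, :] - means[None, :, :]
    log_prob = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    # Posterior probability of each mixture component given each frame.
    log_post = np.log(weights) + log_prob
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)
    # Zeroth- and first-order sufficient statistics.
    n = post.sum(axis=0)                                # (M,)
    ex = (post.T @ X) / np.maximum(n, 1e-10)[:, None]   # (M, D)
    # Interpolate between the data mean and the UBM mean.
    alpha = (n / (n + relevance))[:, None]
    return alpha * ex + (1.0 - alpha) * means


def up_avr(X, weights, means, variances,
           n_partitions=4, n_resamplings=2, seed=None):
    """Sketch of utterance partitioning with acoustic vector resampling.

    Returns an array of shape (n_resamplings * n_partitions, M * D),
    one GMM-supervector per segment.
    """
    rng = np.random.default_rng(seed)
    supervectors = []
    for _ in range(n_resamplings):
        # Randomize the sequence order of the acoustic vectors.
        shuffled = X[rng.permutation(len(X))]
        # Partition the randomized sequence into segments.
        for segment in np.array_split(shuffled, n_partitions):
            adapted = map_adapt_means(segment, weights, means, variances)
            # Concatenate the adapted mean vectors into a supervector.
            supervectors.append(adapted.ravel())
    return np.stack(supervectors)
```

Under these assumptions, calling `up_avr` on one enrollment utterance with 4 partitions and 2 resamplings yields 8 speaker-class supervectors, which could then be used alongside the impostor-class supervectors when training the speaker-dependent SVM.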

Bibliographic reference.  Mak, Man-Wai / Rao, Wei (2010): "Acoustic vector resampling for GMMSVM-based speaker verification", In INTERSPEECH-2010, 1449-1452.