5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

N-Best GMM's for Speaker Identification

Chakib Tadj (1), Pierre Dumouchel (2), Yu Fang (3)

(1) Ecole de Technologie Superieure, Montreal, Canada (2) Centre de Recherche Informatique de Montreal, Montreal, Canada (3) Institut Universitaire de Technologie, Canada

In this paper, we present and compare two alternative post-processing approaches for generating decision rules for text-dependent speaker identification based on Gaussian Mixture Models (GMMs). The first approach, a linear programming method, minimizes a cost on combined scores obtained from the N-Best GMM output probabilities. The second, more heuristic, approach combines the output score probabilities to generate decision rules. Statistical tools have been developed to explore the relative importance of these approaches for recognition accuracy. Experiments on the Spidre database are presented to show the effects of these two approaches on speaker identification performance (including the effect of the number of N-Best hypotheses and of handset variability). The linear programming approach does not show any improvement; however, the combined statistical approach demonstrates an improvement of more than 11% compared with the performance of our standard system.
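
As an illustration of the N-Best GMM scores that both post-processing approaches start from, the following is a minimal sketch and not the authors' implementation. It assumes diagonal-covariance GMMs, an average per-frame log-likelihood as the score, and toy single-component models; the names `gmm_avg_loglik`, `n_best`, and the speaker labels are hypothetical. The abstract does not specify the linear programming cost or the heuristic combination rules, so neither is sketched here.

```python
import numpy as np

def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D); weights: (M,); means and variances: (M, D).
    """
    diff = frames[:, None, :] - means[None, :, :]                       # (T, M, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)   # (M,)
    log_comp = log_norm - 0.5 * np.sum(diff ** 2 / variances, axis=2)   # (T, M)
    log_weighted = np.log(weights) + log_comp
    # Numerically stable log-sum-exp over the mixture components.
    peak = log_weighted.max(axis=1, keepdims=True)
    log_frame = peak[:, 0] + np.log(np.exp(log_weighted - peak).sum(axis=1))
    return float(log_frame.mean())

def n_best(frames, speaker_models, n):
    """Return the n highest-scoring (speaker, score) pairs, best first."""
    scores = {name: gmm_avg_loglik(frames, *params)
              for name, params in speaker_models.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:n]

# Toy data (assumption): three hypothetical speakers, one Gaussian each.
rng = np.random.default_rng(0)
frames = rng.normal(loc=0.0, scale=1.0, size=(50, 2))
models = {
    "spk_a": (np.array([1.0]), np.array([[0.0, 0.0]]), np.ones((1, 2))),
    "spk_b": (np.array([1.0]), np.array([[5.0, 5.0]]), np.ones((1, 2))),
    "spk_c": (np.array([1.0]), np.array([[-5.0, -5.0]]), np.ones((1, 2))),
}
ranked = n_best(frames, models, n=2)
```

Here `ranked` holds the 2-Best list; a post-processing stage of the kind described above would take such a list as input and produce the final identification decision.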


Bibliographic reference.  Tadj, Chakib / Dumouchel, Pierre / Fang, Yu (1997): "N-best GMM's for speaker identification", In EUROSPEECH-1997, 2295-2298.