September 22-25, 1997
In this paper, we present and compare two alternative post-processing approaches for generating decision rules for text-dependent speaker identification based on Gaussian Mixture Models (GMMs). The first approach, a linear programming method, minimizes a cost over combined scores obtained from the N-best GMM output probabilities. The second, more heuristic approach combines the output score probabilities to generate decision rules. Statistical tools have been developed to explore the relative importance of these approaches to recognition accuracy. Experiments on the Spidre database show the effects of the two approaches on speaker identification performance, including the influence of the number of N-best hypotheses and of handset variability. The linear programming approach does not show any improvement; however, a combined statistical approach yields an improvement of more than 11% compared to our standard system.
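As a rough illustration of the N-best idea described in the abstract, the sketch below scores an utterance against several diagonal-covariance GMMs, keeps the N best speakers, and combines the lists from several segments. It is only a sketch under stated assumptions: the abstract does not specify the paper's decision rules, so the softmax-and-sum combination in `combine_n_best`, the function names, and the toy models and data are all hypothetical and are not the authors' method.

```python
import numpy as np

def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D); weights: (M,); means and variances: (M, D).
    """
    diff = frames[:, None, :] - means[None, :, :]                 # (T, M, D)
    log_comp = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    log_mix = np.log(weights) + log_comp                          # (T, M)
    # Log-sum-exp over mixture components for numerical stability.
    peak = log_mix.max(axis=1, keepdims=True)
    frame_ll = peak[:, 0] + np.log(np.exp(log_mix - peak).sum(axis=1))
    return float(frame_ll.mean())

def n_best(scores, n):
    """Return the n (speaker, score) pairs with the highest scores, best first."""
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:n]

def combine_n_best(lists):
    """Hypothetical combination rule (an assumption, not the paper's rule):
    softmax-normalize each N-best list, sum per speaker, pick the maximum."""
    totals = {}
    for nbest in lists:
        vals = np.array([score for _, score in nbest])
        probs = np.exp(vals - vals.max())
        probs /= probs.sum()
        for (speaker, _), p in zip(nbest, probs):
            totals[speaker] = totals.get(speaker, 0.0) + float(p)
    return max(totals, key=totals.get)

# Toy one-dimensional speaker models (weights, means, variances); made-up values.
models = {
    "spk_a": (np.array([1.0]), np.array([[0.0]]), np.array([[1.0]])),
    "spk_b": (np.array([1.0]), np.array([[5.0]]), np.array([[1.0]])),
    "spk_c": (np.array([0.5, 0.5]), np.array([[-4.0], [9.0]]),
              np.array([[1.0], [1.0]])),
}
# Two made-up utterance segments, each a (T, 1) array of feature frames.
segments = [np.array([[0.1], [-0.2]]), np.array([[0.4], [0.0]])]

lists = [
    n_best({spk: gmm_avg_loglik(seg, *params) for spk, params in models.items()}, 2)
    for seg in segments
]
decision = combine_n_best(lists)
```

Here `lists` holds one 2-best list per segment and `decision` is the speaker with the largest summed normalized score. A real system would use multivariate cepstral features and trained models; the combination step is where alternative rules (such as a linear-programming cost) would be substituted.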
Bibliographic reference. Tadj, Chakib / Dumouchel, Pierre / Fang, Yu (1997): "N-best GMM's for speaker identification", In EUROSPEECH-1997, 2295-2298.