Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Using Maximum Likelihood Linear Regression for Segment Clustering and Speaker Identification

Michiel Bacchiani

AT&T Labs-Research, Florham Park, NJ, USA

Many adaptation scenarios rely on clustering of either the test or training data. Although consistency between the clustering and adaptation objective functions is desired, most previous approaches have not implemented such consistency. This paper shows that the statistics used in Maximum Likelihood Linear Regression (MLLR) adaptation are sufficient to cluster data with a consistent Maximum Likelihood (ML) criterion. In addition, as the algorithm uses the same statistics for both adaptation and clustering, it is computationally efficient. Clustering experiments contrasting the performance of this algorithm with the widely used text-independent Gaussian mixture model approach show increased adaptation likelihoods and consistency of within-cluster speaker identity. In a speaker identification experiment the adaptation-based scoring showed improved classification performance compared to the mixture model-based scoring.
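
The idea that adaptation statistics alone can drive ML clustering can be illustrated with a deliberately simplified sketch. It is not the algorithm of the paper: instead of a full MLLR transform over an HMM, it assumes a single one-dimensional Gaussian with known mean `mu` and variance `var`, and adapts only a bias (offset) per cluster. The function names (`segment_stats`, `adapted_loglik`, `merge_cost`, `cluster`) and the greedy bottom-up merging strategy are assumptions made for illustration. What the sketch does show is the property the abstract relies on: the per-segment statistics are additive, so the adapted likelihood of any merged cluster follows from summed statistics without revisiting the data.

```python
import numpy as np

def segment_stats(x, mu):
    """Accumulate bias-adaptation statistics for one segment:
    [count, sum of residuals, sum of squared residuals] w.r.t. the model mean."""
    r = np.asarray(x, dtype=float) - mu
    return np.array([r.size, r.sum(), (r * r).sum()])

def adapted_loglik(stats, var):
    """Log-likelihood of the data behind `stats` after ML bias adaptation.
    The ML bias is b = s1 / n, and sum((r - b)^2) = s2 - n * b^2."""
    n, s1, s2 = stats
    b = s1 / n
    return -0.5 * n * np.log(2.0 * np.pi * var) - 0.5 * (s2 - n * b * b) / var

def merge_cost(sa, sb, var):
    """Likelihood lost when two clusters share one bias instead of two.
    Merging only requires adding the statistics vectors."""
    return adapted_loglik(sa, var) + adapted_loglik(sb, var) - adapted_loglik(sa + sb, var)

def cluster(stats_list, var, n_clusters):
    """Greedy bottom-up clustering: repeatedly merge the pair of clusters
    whose merge loses the least adapted likelihood."""
    clusters = [(s.copy(), [i]) for i, s in enumerate(stats_list)]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = merge_cost(clusters[i][0], clusters[j][0], var)
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        merged = (clusters[i][0] + clusters[j][0], clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return [sorted(members) for _, members in clusters]

# Hypothetical toy data: two segments offset by about +2, two by about -2.
segments = [[1.9, 2.1, 2.0], [2.2, 1.8], [-2.0, -1.9, -2.1], [-2.2, -1.8]]
stats = [segment_stats(x, mu=0.0) for x in segments]
print(cluster(stats, var=1.0, n_clusters=2))
```

Because the clustering criterion and the adaptation criterion are the same likelihood, the clusters found this way are by construction the ones that benefit most from a shared adaptation parameter, which is the consistency argument made above.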



Bibliographic reference.  Bacchiani, Michiel (2000): "Using maximum likelihood linear regression for segment clustering and speaker identification", In ICSLP-2000, vol.4, 536-539.