5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

A Comparison of Novel Techniques for Instantaneous Speaker Adaptation

Timothy J. Hazen, James R. Glass

Spoken Language Systems Group, Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA

This paper introduces two novel techniques for instantaneous speaker adaptation: reference speaker weighting and consistency modeling. An approach to hierarchical speaker clustering that uses gender and speaking rate as the clustering criteria is also presented. All three methods attempt to exploit the underlying within-speaker correlations between the acoustic realizations of different phones. By accounting for these correlations, a limited amount of adaptation data can be used to adapt every phonetic acoustic model, including those for phones that have not been observed in the adaptation data. In instantaneous adaptation experiments using the DARPA Resource Management corpus, a combination of these new techniques reduced the word error rate by 20%.
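The idea of adapting unobserved phones through within-speaker correlations can be illustrated with a small sketch of reference speaker weighting. The abstract does not give the estimation procedure, so everything here is an assumption made for illustration: the new speaker's phone means are modeled as a weighted combination of reference speakers' phone means, the weights are fit by ordinary least squares on the observed phones only, and the function names `estimate_weights` and `adapt_all_phones` are hypothetical.

```python
import numpy as np

def estimate_weights(ref_means, obs_phones, obs_means):
    """Fit one weight per reference speaker from the observed phones only.

    ref_means:  array (R, P, D) of per-phone mean vectors for R reference speakers.
    obs_phones: list of phone indices seen in the adaptation data.
    obs_means:  array (len(obs_phones), D) of means measured for the new speaker.
    Assumption: plain least squares stands in for the paper's estimator.
    """
    num_refs = ref_means.shape[0]
    # Stack the observed phones of each reference speaker into one column.
    design = ref_means[:, obs_phones, :].reshape(num_refs, -1).T
    target = np.asarray(obs_means).reshape(-1)
    weights, *_ = np.linalg.lstsq(design, target, rcond=None)
    return weights

def adapt_all_phones(ref_means, weights):
    """Apply the weights to every phone, including unobserved ones."""
    return np.tensordot(weights, ref_means, axes=1)  # shape (P, D)

# Synthetic example: 3 reference speakers, 5 phones, 2-dimensional features.
rng = np.random.default_rng(0)
ref_means = rng.normal(size=(3, 5, 2))
true_weights = np.array([0.5, 0.3, 0.2])
speaker_means = np.tensordot(true_weights, ref_means, axes=1)

obs_phones = [0, 3]  # only two of the five phones are observed
weights = estimate_weights(ref_means, obs_phones, speaker_means[obs_phones])
adapted = adapt_all_phones(ref_means, weights)
```

In this synthetic setup the new speaker lies exactly in the span of the reference speakers, so the weights fit on two phones also yield estimates for the three unobserved phones. With real data the fit would be approximate, and the weights would typically be constrained or regularized.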


Bibliographic reference. Hazen, Timothy J. / Glass, James R. (1997): "A comparison of novel techniques for instantaneous speaker adaptation", in EUROSPEECH-1997, 2047-2050.