12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Online Speaker Adaptation with Pre-Computed FMLLR Transformations

Volker Fischer, Siegfried Kunzmann

European Media Laboratory GmbH, Germany

This paper presents a memory-efficient single-pass speech recognizer that uses pre-computed FMLLR transformations for online speaker adaptation. To this end, we apply unsupervised segment clustering to the training corpus, create a transformation matrix for each cluster, and train a text-independent Gaussian mixture classifier for cluster selection at runtime. We use the RWTH Aachen University open source speech recognition toolkit for evaluation and compare the results to a standard speaker adaptive two-pass decoding strategy. Results indicate that the method improves single-pass recognition in VTLN feature space with almost no overhead from cluster selection, and show a relative improvement of up to 15 percent over speaker adaptive decoding when only little data is available for unsupervised online adaptation.
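The runtime step described in the abstract (pick a cluster with a Gaussian mixture classifier, then apply that cluster's pre-computed transformation) can be illustrated with a minimal sketch. This is not the authors' implementation and does not use the RWTH toolkit's API: the function names `gmm_avg_loglik`, `select_cluster`, and `apply_fmllr` are hypothetical, the diagonal-covariance mixtures are an assumption, and selecting the cluster by maximum average frame log-likelihood is an assumed decision rule. The transformation is taken to be the usual affine feature-space form x' = A x + b.

```python
import numpy as np

def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D); weights: (M,); means, variances: (M, D).
    Diagonal covariances are an assumption of this sketch.
    """
    diff = frames[:, None, :] - means[None, :, :]              # (T, M, D)
    log_comp = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    log_comp = log_comp + np.log(weights)                      # (T, M)
    # Numerically stable log-sum-exp over the mixture components.
    peak = log_comp.max(axis=1, keepdims=True)
    frame_ll = peak[:, 0] + np.log(np.exp(log_comp - peak).sum(axis=1))
    return float(frame_ll.mean())

def select_cluster(frames, cluster_gmms):
    """Index of the cluster whose GMM scores the frames highest.

    cluster_gmms: list of (weights, means, variances) tuples, one per cluster.
    Maximum average log-likelihood is an assumed selection rule.
    """
    scores = [gmm_avg_loglik(frames, *gmm) for gmm in cluster_gmms]
    return int(np.argmax(scores))

def apply_fmllr(frames, A, b):
    """Apply an affine feature transform x' = A x + b to every frame."""
    return frames @ A.T + b

def adapt_online(frames, cluster_gmms, cluster_transforms):
    """Select a cluster, then apply its pre-computed (A, b) transform."""
    k = select_cluster(frames, cluster_gmms)
    A, b = cluster_transforms[k]
    return k, apply_fmllr(frames, A, b)
```

In this sketch the only per-utterance work is scoring the incoming frames against each cluster's mixture and one matrix multiplication, which mirrors the low overhead the abstract attributes to cluster selection; no transformation is estimated at runtime.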


Bibliographic reference. Fischer, Volker / Kunzmann, Siegfried (2011): "Online speaker adaptation with pre-computed FMLLR transformations", in INTERSPEECH-2011, 2569-2572.