INTERSPEECH 2011
This paper presents a memory-efficient single-pass speech recognizer that uses pre-computed FMLLR transformations for online speaker adaptation. To this end, we apply unsupervised segment clustering to the training corpus, create a transformation matrix for each cluster, and train a text-independent Gaussian mixture classifier that selects the cluster at runtime. We use the RWTH Aachen University open-source speech recognition toolkit for evaluation and compare the results to a standard speaker-adaptive two-pass decoding strategy. Results indicate that the method improves single-pass recognition in VTLN feature space with almost no overhead from cluster selection, and show a relative improvement of up to 15 percent over speaker-adaptive decoding when only little data is available for unsupervised online adaptation.
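The runtime step described above (pick a cluster with a Gaussian mixture classifier, then apply that cluster's pre-computed transformation) can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes one diagonal-covariance GMM per cluster, maximum-likelihood selection over the available frames, and an affine FMLLR transform of the form x' = Ax + b. All function names and the toy data are hypothetical.

```python
import numpy as np


def diag_gmm_loglik(frames, weights, means, variances):
    """Total log-likelihood of frames (T, D) under a diagonal-covariance GMM.

    weights: (M,), means: (M, D), variances: (M, D). Assumed model form.
    """
    diff = frames[:, None, :] - means[None, :, :]          # (T, M, D)
    log_comp = (
        np.log(weights)[None, :]
        - 0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
        - 0.5 * np.sum(diff ** 2 / variances[None, :, :], axis=2)
    )                                                       # (T, M)
    # Numerically stable log-sum-exp over mixture components.
    peak = log_comp.max(axis=1, keepdims=True)
    per_frame = peak[:, 0] + np.log(np.exp(log_comp - peak).sum(axis=1))
    return per_frame.sum()


def select_cluster(frames, cluster_gmms):
    """Return the index of the cluster whose GMM scores the frames highest."""
    scores = [diag_gmm_loglik(frames, *gmm) for gmm in cluster_gmms]
    return int(np.argmax(scores))


def apply_fmllr(frames, A, b):
    """Apply an affine feature-space transform x' = A x + b to every frame."""
    return frames @ A.T + b


# Toy data: two clusters in a 2-dimensional feature space (illustrative only).
gmm0 = (np.array([1.0]), np.zeros((1, 2)), np.ones((1, 2)))
gmm1 = (np.array([1.0]), np.full((1, 2), 5.0), np.ones((1, 2)))
transforms = [(np.eye(2), np.zeros(2)), (2.0 * np.eye(2), np.ones(2))]

frames = np.array([[4.8, 5.1], [5.2, 4.9]])
k = select_cluster(frames, [gmm0, gmm1])   # frames lie near cluster 1's mean
A, b = transforms[k]
adapted = apply_fmllr(frames, A, b)
```

Because the transforms are estimated offline, the only per-utterance cost in this sketch is scoring the frames against each cluster model and one matrix multiplication, which is consistent with the low overhead the abstract reports.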
Bibliographic reference. Fischer, Volker / Kunzmann, Siegfried (2011): "Online speaker adaptation with pre-computed FMLLR transformations", In INTERSPEECH-2011, 2569-2572.