Speaker-space-based adaptation methods for automatic speech recognition have been shown to provide significant performance improvements for tasks where only a few seconds of adaptation speech are available. This paper proposes a robust, low-complexity technique within this general class that is shown to reduce word error rate, reduce the large storage requirements associated with speaker space approaches, and eliminate the need for large numbers of utterances per speaker in training. The technique represents speakers as linear combinations of clustered linear basis vectors, and a procedure is presented for maximum likelihood (ML) estimation of these vectors from training data. Significant word error rate reductions were obtained relative to speaker-independent performance for the Resource Management and Wall Street Journal task domains.
Bibliographic reference. Tang, Yun / Rose, Richard (2007): "Clustered maximum likelihood linear basis for rapid speaker adaptation", In INTERSPEECH-2007, 254-257.