Sixth International Conference on Spoken Language Processing
In this paper we introduce an approach to transformation-based model adaptation. Previously published schemes such as MLLR define a set of affine transformations to be applied to clusters of model parameters. Although it has been shown that this approach can yield good results when adaptation data is scarce, an inherent problem needs to be considered: the number of transformations used has a significant influence on adaptation performance. Using too many transformations results in poorly estimated transformation parameters, eventually leading to a model that overfits the adaptation data. On the other hand, when too few transformations are used, a restricted mapping is obtained, leading to a suboptimal adapted model. We address this problem by estimating the transformation parameters in a maximum a posteriori sense, using a set of hierarchical priors arranged in a tree structure. We show that this approach yields a significant improvement over MLLR for unsupervised model adaptation on the WSJ Spoke 3 test set.
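The core idea of using hierarchical priors can be sketched as follows. This is a minimal illustration, not the paper's actual update equations: each node in a tree of parameter clusters gets an ML transform estimate from its own adaptation data, which is then shrunk toward its parent's transform acting as the prior mean. The function name `map_transform`, the prior weight `tau`, the counts, and the example matrices are all illustrative assumptions.

```python
import numpy as np

def map_transform(W_ml, n, W_prior, tau):
    """MAP-style shrinkage of a node's ML transform toward the parent
    node's transform (an illustrative simplification, not the exact
    SMAPLR update).  n is the effective count of adaptation frames at
    this node; tau is an assumed prior weight controlling how strongly
    the parent transform acts as a prior."""
    return (n * W_ml + tau * W_prior) / (n + tau)

# Hypothetical 2-D affine transforms in extended form [A | b].
W_root = np.hstack([np.eye(2), np.zeros((2, 1))])   # identity prior at the root
W_ml_child = np.array([[1.2, 0.0,  0.5],
                       [0.0, 0.9, -0.3]])           # ML estimate at a leaf

# With few adaptation frames, the MAP estimate stays near the prior ...
W_sparse = map_transform(W_ml_child, n=10, W_prior=W_root, tau=100)
# ... while with ample data it approaches the ML estimate.
W_rich = map_transform(W_ml_child, n=10000, W_prior=W_root, tau=100)
```

The effect matches the trade-off described above: sparsely observed clusters fall back on the parent (ultimately a single global transform), while well-observed clusters are free to move toward their own ML estimates, so the effective number of transformations adapts to the amount of data.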
Bibliographic reference. Myrvoll, Tor André / Siohan, Olivier / Lee, Chin-Hui / Chou, Wu (2000): "Structural maximum a-posteriori linear regression for unsupervised speaker adaptation", In ICSLP-2000, vol.4, 540-543.