EUROSPEECH 2003 - INTERSPEECH 2003
This paper presents a geometrically constrained transformation approach for fast acoustic adaptation that improves the modeling resolution of conventional Maximum Likelihood Linear Regression (MLLR). In this approach, the underlying geometric difference between the seed and target spaces is exposed, quantified, and used as prior knowledge to reconstruct refiner transforms. By ignoring dimensions that contribute little to this difference, the transform can be constrained to a lower-rank subspace, and only distortions within this subspace are refined in a cascaded process. Compared with the previous cascade method, we employ a different parameterization and obtain a higher resolution. At the same time, because the geometric span of the refiner transforms is tightly controlled, they can be adapted quickly. The approach therefore achieves a better tradeoff between resolution and robustness. In Mandarin dialect adaptation, it provides a 4-9% relative word-error-rate reduction over MLLR and a 3-5% reduction over the previous cascade method, respectively, across varying amounts of data.
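The cascade described above can be illustrated with a minimal sketch: a conventional MLLR mean update followed by a refiner that only acts inside a low-rank subspace. The abstract does not give the parameterization, so everything specific here is an assumption made for illustration: the helper names (`matvec`, `mllr_mean`, `refine_in_subspace`), the subspace basis `U`, the refiner parameters `C` and `d`, and all numeric values are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only. The refiner parameterization (U, C, d) is an
# assumption for this example, not the parameterization used in the paper.

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def mllr_mean(A, b, mu):
    """Conventional MLLR mean update: mu' = A mu + b."""
    return [x + b_i for x, b_i in zip(matvec(A, mu), b)]

def refine_in_subspace(U, C, d, mu):
    """Cascaded refiner restricted to the span of the columns of U.

    U: D x k basis with orthonormal columns (hypothetical subspace basis)
    C: k x k matrix and d: length-k offset (hypothetical refiner parameters)
    Returns mu + U (C U^T mu + d), so the correction lies entirely in span(U).
    """
    Ut = [list(col) for col in zip(*U)]   # transpose: k x D
    z = matvec(Ut, mu)                    # coordinates of mu in the subspace
    delta = [c + d_i for c, d_i in zip(matvec(C, z), d)]  # correction, length k
    lift = matvec(U, delta)               # map the correction back to D dims
    return [m + l for m, l in zip(mu, lift)]

# Toy example with D = 3 feature dimensions and a rank-1 subspace (k = 1).
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # made-up global transform
b = [0.0, 1.0, 0.0]
U = [[1.0], [0.0], [0.0]]   # subspace = first coordinate axis
C = [[0.5]]
d = [1.0]

seed_mean = [2.0, 4.0, 6.0]
coarse = mllr_mean(A, b, seed_mean)           # [2.0, 5.0, 6.0]
adapted = refine_in_subspace(U, C, d, coarse)
print(adapted)                                # [4.0, 5.0, 6.0]
```

Under this assumed form, the refiner has only k*k + k free parameters instead of the D*D + D of a full affine transform, which is one way to read the claim that a tightly controlled span can be adapted from little data. Components outside the subspace (the second and third coordinates in the toy example) pass through the refiner unchanged.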
Bibliographic reference. Zhang, Huayun / Xu, Bo (2003): "Geometric constrained maximum likelihood linear regression on Mandarin dialect adaptation", In EUROSPEECH-2003, 1465-1468.