11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Prior Information for Rapid Speaker Adaptation

Catherine Breslin, K. K. Chin, Mark J. F. Gales, Kate Knill, Haitian Xu

Toshiba Research Europe Ltd., UK

Rapidly adapting a speech recognition system to new speakers using a small amount of adaptation data is important for improving the initial user experience. In this paper, a count-smoothing framework for incorporating prior information is extended to allow the use of different forms of dynamic prior and to improve the robustness of transform estimation on small amounts of data. Prior information is obtained from existing rapid adaptation techniques such as VTLN and PCMLLR. Results using VTLN as a dynamic prior for CMLLR estimation show that transforms estimated on just one utterance can yield relative gains of 15% and 46% over a baseline gender-independent model on two tasks.
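
The general idea of count smoothing can be illustrated with a minimal sketch. It is not the estimation procedure used in the paper, which operates on CMLLR transform statistics. It is a toy scalar version in which the prior contributes a fixed number of pseudo-observations to the accumulated statistics. The function name `smoothed_mean`, the prior weight `tau`, and all numeric values are assumptions chosen for illustration only.

```python
def smoothed_mean(frames, prior_mean, tau):
    """Count-smoothed estimate of a scalar mean.

    The prior is treated as `tau` pseudo-observations, each equal to
    `prior_mean`, and is added to the statistics accumulated from the
    adaptation data before the estimate is formed.
    """
    count = len(frames)   # zeroth-order statistic from the adaptation data
    total = sum(frames)   # first-order statistic from the adaptation data
    return (total + tau * prior_mean) / (count + tau)


# Hypothetical values: the prior says 0.0, the speaker's data says 1.0.
prior_mean = 0.0
tau = 10.0

little_data = [1.0] * 5      # e.g. a single short utterance
plenty_data = [1.0] * 1000   # e.g. many utterances

estimate_little = smoothed_mean(little_data, prior_mean, tau)
estimate_plenty = smoothed_mean(plenty_data, prior_mean, tau)

# With little data the estimate stays closer to the prior;
# with plenty of data the observed statistics dominate.
print(estimate_little, estimate_plenty)
```

In this sketch, the choice of `tau` controls how quickly the estimate moves away from the prior as adaptation data accumulates. With no data at all, the estimate falls back to the prior value.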

Bibliographic reference. Breslin, Catherine / Chin, K. K. / Gales, Mark J. F. / Knill, Kate / Xu, Haitian (2010): "Prior information for rapid speaker adaptation", in INTERSPEECH-2010, 1644-1647.