8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Stochastic Gradient Adaptation of Front-End Parameters

Sreeram Balakrishnan, Karthik Visweswariah, Vaibhava Goel

IBM T.J. Watson Research Center, Yorktown Heights, NY, USA

This paper examines how any parameter in the typical front end of a speech recognizer can be rapidly and inexpensively adapted with usage. It focuses on, first, demonstrating that effective adaptation can be accomplished using low-CPU/memory-cost stochastic gradient descent methods; second, showing that adaptation can be done at time scales small enough to make it effective with just a single utterance; and last, showing that using a prior on the parameter significantly improves adaptation performance on small amounts of data. It extends previous work on stochastic gradient descent implementations of fMLLR and on adapting any parameter in the front-end chain using general second-order optimization techniques. The framework for general stochastic gradient descent of any front-end parameter with a prior is presented, along with practical techniques to improve convergence. In addition, methods for obtaining the alignment at small time intervals before the end of the utterance are presented. Finally, it is shown experimentally that online causal adaptation can yield a 5-15% WER reduction across a variety of problem sets and noise conditions, even with just one or two utterances of adaptation data.
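The core update the abstract describes combines a stochastic gradient on the recognition objective with a pull toward a prior value of the parameter. A minimal sketch of such a MAP-style SGD step is below; the function and parameter names (`sgd_step_with_prior`, `prior_weight`) are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def sgd_step_with_prior(theta, grad_loss, theta_prior,
                        lr=0.01, prior_weight=0.1):
    """One stochastic gradient step on an adaptation objective that is
    regularised by a Gaussian prior centred on theta_prior.

    Illustrative sketch only: the paper's actual objective, learning-rate
    schedule, and prior form may differ.
    """
    # Gradient of the penalty 0.5 * prior_weight * ||theta - theta_prior||^2
    grad_prior = prior_weight * (theta - theta_prior)
    # Standard SGD step on the combined objective.
    return theta - lr * (grad_loss + grad_prior)

# With little adaptation data (grad_loss near zero), the prior term
# keeps the parameter close to its initial value, which is the
# behaviour the abstract credits for robustness on 1-2 utterances.
theta = np.array([1.0, 2.0])
theta = sgd_step_with_prior(theta, grad_loss=np.zeros(2),
                            theta_prior=np.zeros(2),
                            lr=0.1, prior_weight=1.0)
```

With a zero loss gradient, each step shrinks the parameter toward the prior by a factor of `1 - lr * prior_weight`, so the prior dominates exactly when the data provides no gradient signal.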


Bibliographic reference: Balakrishnan, Sreeram / Visweswariah, Karthik / Goel, Vaibhava (2004): "Stochastic gradient adaptation of front-end parameters", in INTERSPEECH-2004, 1-4.