9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Search and Classification Based Language Model Adaptation

Qin Shi (1), Stephen M. Chu (2), Wen Liu (1), Hong-Kwang Jeff Kuo (2), Yi Liu (1), Yong Qin (1)

(1) IBM China Research Lab, China; (2) IBM T.J. Watson Research Center, USA

Adaptation techniques in language modeling have shown growing potential for improving speech recognition performance. For topic adaptation, a set of pre-defined topic-specific language models is typically used, and adaptation is achieved by adjusting the interpolation weights. However, mismatch between the test data and the pre-defined models inevitably exists and is left untreated in this static approach. Instead of tuning the parameters of the existing models, this paper describes a method that dynamically extracts relevant documents from the training sources, based on intermediate decoding hypotheses, to build new targeted language models. Unlike general search-based document collection, a new and effective ranking method is used here for candidate extraction. The targeted language models are interpolated with the static topic language models and a general language model, and the result is used for lattice rescoring. The proposed adaptation technique is implemented in a state-of-the-art Mandarin broadcast transcription system and evaluated on the GALE task. We show that static topic adaptation reduces the character error rate by 4.9% relative. It is further shown that the proposed dynamic adaptation technique attains an additional 10.3% reduction in error rate.
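
The interpolation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the interpolation weights are estimated, so the EM procedure below is a common generic choice and an assumption, not the authors' method. The function names, the three-component setup (general, static topic, targeted), and the toy held-out probabilities are all hypothetical.

```python
import math

def interpolate(probs, weights):
    """Linear interpolation: P(w|h) = sum_i lambda_i * P_i(w|h)."""
    return sum(lam * p for lam, p in zip(weights, probs))

def em_weights(component_probs, iterations=50):
    """Estimate interpolation weights on held-out data with EM.

    component_probs holds one tuple per held-out token; entry i is the
    probability that component model i assigns to that token.
    All probabilities are assumed to be strictly positive.
    """
    k = len(component_probs[0])
    weights = [1.0 / k] * k  # start from uniform weights
    for _ in range(iterations):
        counts = [0.0] * k
        for probs in component_probs:
            mix = interpolate(probs, weights)
            # E-step: posterior responsibility of each component for this token
            for i in range(k):
                counts[i] += weights[i] * probs[i] / mix
        # M-step: new weights are the average responsibilities
        weights = [c / len(component_probs) for c in counts]
    return weights

def perplexity(component_probs, weights):
    """Perplexity of the interpolated model on the held-out tokens."""
    log_sum = sum(math.log(interpolate(p, weights)) for p in component_probs)
    return math.exp(-log_sum / len(component_probs))

# Hypothetical per-token probabilities from (general, static topic, targeted) models.
heldout = [
    (0.01, 0.02, 0.10),
    (0.05, 0.03, 0.04),
    (0.002, 0.01, 0.08),
    (0.03, 0.03, 0.02),
]
weights = em_weights(heldout)
print("weights:", [round(w, 3) for w in weights])
print("perplexity (uniform):", perplexity(heldout, [1 / 3] * 3))
print("perplexity (EM):     ", perplexity(heldout, weights))
```

In a real system the per-token probabilities would come from n-gram models scored on held-out text or on first-pass hypotheses, and the interpolated model would then be applied during lattice rescoring.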


Bibliographic reference.  Shi, Qin / Chu, Stephen M. / Liu, Wen / Kuo, Hong-Kwang Jeff / Liu, Yi / Qin, Yong (2008): "Search and classification based language model adaptation", In INTERSPEECH-2008, 1578-1581.