12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Sentence Selection by Direct Likelihood Maximization for Language Model Adaptation

Takahiro Shinozaki (1), Yu Kubota (1), Sadaoki Furui (1), Eiji Utsunomiya (2), Yasutaka Shindoh (2)

(1) Tokyo Institute of Technology, Japan
(2) KDDI R&D Laboratories Inc., Japan

A general framework for language model task adaptation is to select documents from a large training set based on a language model estimated on development data. However, this strategy has the deficiency that the selected documents are biased toward the most frequent patterns in the development data. To address this problem, a new task adaptation method is proposed that selects documents from the training set so as to directly reduce the perplexity on the development set. Moreover, a weighting method that modifies the perplexity objective function is proposed to improve generalization to unseen data. The proposed adaptation methods are evaluated in large-vocabulary speech recognition experiments. The proposed adaptation with the weighting term is shown to produce a compact model that gives consistently lower word error rates across different tasks.
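The abstract does not spell out the selection procedure, so the following is only a rough sketch of what "selecting training text so as to directly reduce development-set perplexity" could look like. It assumes a greedy search, an add-one-smoothed unigram model, and whitespace tokenization. None of these choices come from the paper, the function names are hypothetical, and the proposed weighting term is not modeled.

```python
import math
from collections import Counter


def dev_perplexity(counts, total, dev_tokens, vocab_size):
    """Perplexity of an add-one-smoothed unigram model on the dev tokens."""
    log_prob = 0.0
    for w in dev_tokens:
        log_prob += math.log((counts[w] + 1) / (total + vocab_size))
    return math.exp(-log_prob / len(dev_tokens))


def greedy_select(train_sentences, dev_sentences):
    """Greedily add the training sentence that most lowers dev perplexity.

    Assumption: this greedy unigram procedure is illustrative only and is
    not claimed to be the selection algorithm used in the paper.
    Returns (selected sentence indices in order of selection, final perplexity).
    """
    dev_tokens = [w for s in dev_sentences for w in s.split()]
    vocab = set(dev_tokens)
    for s in train_sentences:
        vocab.update(s.split())
    vocab_size = len(vocab)

    counts, total = Counter(), 0
    selected = []
    remaining = list(range(len(train_sentences)))
    best_ppl = dev_perplexity(counts, total, dev_tokens, vocab_size)

    while remaining:
        best_idx, best_trial_ppl = None, best_ppl
        for i in remaining:
            toks = train_sentences[i].split()
            trial_counts = counts + Counter(toks)
            ppl = dev_perplexity(trial_counts, total + len(toks),
                                 dev_tokens, vocab_size)
            if ppl < best_trial_ppl:
                best_idx, best_trial_ppl = i, ppl
        if best_idx is None:
            break  # no remaining sentence reduces dev perplexity
        toks = train_sentences[best_idx].split()
        counts.update(toks)
        total += len(toks)
        selected.append(best_idx)
        remaining.remove(best_idx)
        best_ppl = best_trial_ppl

    return selected, best_ppl
```

Because each candidate is scored by its effect on the perplexity of the whole development set, a sentence is kept only if it improves coverage of that set overall, rather than because it matches the set's most frequent patterns. This sketch rescans every remaining sentence on each pass, so it is meant for small illustrative inputs, not a large training corpus.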


Bibliographic reference. Shinozaki, Takahiro / Kubota, Yu / Furui, Sadaoki / Utsunomiya, Eiji / Shindoh, Yasutaka (2011): "Sentence selection by direct likelihood maximization for language model adaptation", In INTERSPEECH-2011, 613-616.