ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition

April 13-16, 2003
Tokyo Institute of Technology, Tokyo, Japan

Unsupervised Language Model Adaptation Using Word Classes for Spontaneous Speech Recognition

T. Yokoyama, T. Shinozaki, K. Iwano, Sadaoki Furui

Department of Computer Science, Tokyo Institute of Technology, Tokyo, Japan

This paper proposes an unsupervised, batch-type, class-based language model adaptation method for spontaneous speech recognition. Word classes are determined automatically from a training set by maximizing the average mutual information between classes. A class-based language model is built from recognition hypotheses obtained with a general word-based language model, and is then linearly interpolated with that general language model. All input utterances are re-recognized using the adapted language model. The proposed method was confirmed to improve recognition accuracy for spontaneous presentations. When the method was combined with acoustic model adaptation, the effects of language model adaptation and acoustic model adaptation were found to be additive. The optimum number of classes is 100 regardless of whether acoustic model adaptation is also applied, and under this condition language model adaptation yields an absolute improvement of approximately 2% in word accuracy.
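The adaptation step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the n-gram order, the smoothing scheme, or the interpolation weight, so everything below is an assumption made for illustration: a bigram model, unsmoothed maximum-likelihood estimates, a hypothetical weight `lam`, a hand-written word-to-class map in place of the mutual-information clustering, and a toy uniform model standing in for the general word-based language model. The function names `train_class_bigram` and `interpolate` are likewise hypothetical.

```python
from collections import Counter


def train_class_bigram(hypotheses, word2class):
    """Estimate a class-based bigram model from recognition hypotheses.

    P(w | w_prev) is approximated as P(c(w) | c(w_prev)) * P(w | c(w)),
    using unsmoothed relative frequencies (an assumption of this sketch).
    """
    class_bigram = Counter()   # counts of (c_prev, c) pairs
    class_context = Counter()  # counts of c_prev as a bigram context
    word_in_class = Counter()  # counts of (c, w) pairs
    class_total = Counter()    # counts of each class c
    for sent in hypotheses:
        classes = [word2class[w] for w in sent]
        for w, c in zip(sent, classes):
            word_in_class[(c, w)] += 1
            class_total[c] += 1
        for c_prev, c in zip(classes, classes[1:]):
            class_bigram[(c_prev, c)] += 1
            class_context[c_prev] += 1

    def prob(w_prev, w):
        c_prev, c = word2class[w_prev], word2class[w]
        if class_context[c_prev] == 0 or class_total[c] == 0:
            return 0.0
        p_class = class_bigram[(c_prev, c)] / class_context[c_prev]
        p_word = word_in_class[(c, w)] / class_total[c]
        return p_class * p_word

    return prob


def interpolate(p_general, p_class, lam):
    """Linear interpolation: lam * class-based + (1 - lam) * general."""
    def prob(w_prev, w):
        return lam * p_class(w_prev, w) + (1.0 - lam) * p_general(w_prev, w)
    return prob


# Toy data: two hand-assigned classes (hypothetical, not from the paper).
word2class = {"the": 0, "a": 0, "model": 1, "speech": 1}
hyps = [["the", "model"], ["a", "model"], ["the", "speech"]]

p_class = train_class_bigram(hyps, word2class)
p_general = lambda w_prev, w: 0.25  # uniform stand-in for the general LM
p_adapted = interpolate(p_general, p_class, lam=0.5)

print(p_class("a", "speech"))    # nonzero although "a speech" was never observed
print(p_adapted("a", "speech"))
```

In this toy example the word pair "a speech" never occurs in the hypotheses, yet the class-based model assigns it nonzero probability because both words share classes with observed pairs. This kind of generalization is one plausible reason to prefer classes over words when the adaptation data consists only of recognition hypotheses; in the batch setting of the abstract, the interpolated model would then be used to re-recognize all utterances.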


Bibliographic reference.  Yokoyama, T. / Shinozaki, T. / Iwano, K. / Furui, Sadaoki (2003): "Unsupervised language model adaptation using word classes for spontaneous speech recognition", in SSPR-2003, paper MAP9.