ISCA Archive ICSLP 2000

Discriminative training on language model

Zheng Chen, Kai-Fu Lee, Ming-jing Li

Statistical language models have been successfully applied to many problems, including speech recognition, handwriting recognition, and Chinese pinyin input. In recognition, a statistical language model, such as a trigram model, is used to predict the probabilities of hypothesized word sequences. The traditional method, which relies on distribution estimation, is sub-optimal when the assumed form of the distribution is inapplicable, and "optimality" in distribution estimation does not automatically translate into "optimality" in classifier design. This paper proposes a discriminative training method that minimizes the error rate of the recognizer rather than estimating the distribution of the training data. Furthermore, the lexicon is also optimized through discriminative training to minimize the error rate of the decoder. Compared to the traditional LM building method, our system achieves a 5%-25% reduction in error rate.

Cite as: Chen, Z., Lee, K.-F., Li, M.-j. (2000) Discriminative training on language model. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 1, 493-496

  author={Zheng Chen and Kai-Fu Lee and Ming-jing Li},
  title={{Discriminative training on language model}},
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  pages={vol. 1, 493-496}