Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

A Novel Language Model Based on Self-Organized Learning

Taiyi Huang, Langzhou Chen

Institute of Automation, Chinese Academy of Sciences, Beijing, China

Statistical language models are very important to speech recognition. For a system devoted to a specific topic, a domain-dependent language model performs much better than a general model. Traditional methods for training topic-dependent models have two problems: (1) the corpus available for a specific topic is much smaller than a general corpus; (2) an individual article often relates to more than one topic, a phenomenon that traditional methods do not take into account. This paper tries to solve these two problems. We present a new method of organizing the corpus, based on fuzzy training subsets, and train the domain-dependent models on these fuzzy subsets. In addition, a self-organized learning approach is introduced into the training process to improve the models' predictive ability. Self-organized learning markedly improves the performance of the models.
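The idea of a fuzzy training subset can be illustrated with a minimal sketch. It assumes that each article carries a membership degree for every topic it relates to and that an article's bigram counts are added to a topic's model in proportion to that degree. The membership values, the function names fuzzy_bigram_counts and bigram_prob, and the toy articles are all hypothetical. The abstract does not say how memberships are computed, and the self-organized learning step is not shown here.

```python
from collections import defaultdict


def fuzzy_bigram_counts(articles, topic):
    """Accumulate bigram counts for one topic, weighted by fuzzy membership.

    `articles` is a list of (tokens, memberships) pairs, where `memberships`
    maps a topic name to a degree in [0, 1]. These degrees are assumed inputs.
    """
    counts = defaultdict(float)
    for tokens, memberships in articles:
        weight = memberships.get(topic, 0.0)
        if weight == 0.0:
            continue  # the article does not belong to this topic at all
        for prev, cur in zip(tokens, tokens[1:]):
            counts[(prev, cur)] += weight
    return counts


def bigram_prob(counts, prev, cur):
    """Maximum-likelihood P(cur | prev) from weighted counts (no smoothing)."""
    total = sum(c for (p, _), c in counts.items() if p == prev)
    return counts.get((prev, cur), 0.0) / total if total > 0 else 0.0


# Toy corpus: the second article belongs partly to two topics.
articles = [
    (["stock", "prices", "rose"], {"finance": 1.0}),
    (["stock", "prices", "fell"], {"finance": 0.5, "sports": 0.5}),
    (["ticket", "prices", "fell"], {"sports": 1.0}),
]

finance = fuzzy_bigram_counts(articles, "finance")
sports = fuzzy_bigram_counts(articles, "sports")
print(bigram_prob(finance, "prices", "rose"))
print(bigram_prob(sports, "prices", "fell"))
```

In this sketch a multi-topic article contributes to every topic model it relates to, which enlarges each topic's effective training data compared with assigning the article to a single topic.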

