Sixth International Conference on Spoken Language Processing
In this paper, we divide the corpus into 8 domains by text classification with the K-means algorithm and estimate a trigram LM for each domain. Experiments show, however, that performance becomes worse in some domains. To solve this problem, we perform LM adaptation based on the domain LMs. The adaptation is done by mixing the domain LMs with the background LM through linear interpolation. Relative word error rate reductions of between 5 and 10% over the pruned background LM are achieved.
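To make the interpolation step concrete, here is a minimal sketch of mixing a domain LM with a background LM for one fixed history. It is not the authors' implementation: the function name `interpolate`, the weight `lam`, and the toy probabilities are all assumptions made for illustration, and the abstract does not say how the interpolation weight was chosen.

```python
def interpolate(p_domain, p_background, lam):
    """Mix two next-word distributions that share the same history.

    Computes lam * P_domain(w | h) + (1 - lam) * P_background(w | h)
    for every word w seen in either distribution. `lam` is a
    hypothetical interpolation weight in [0, 1].
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    vocab = set(p_domain) | set(p_background)
    return {
        w: lam * p_domain.get(w, 0.0) + (1.0 - lam) * p_background.get(w, 0.0)
        for w in vocab
    }


# Toy next-word distributions for one trigram history (made-up numbers).
domain_lm = {"rate": 0.5, "bank": 0.5}
background_lm = {"rate": 0.25, "bank": 0.25, "river": 0.5}

adapted = interpolate(domain_lm, background_lm, lam=0.5)
```

Because both inputs sum to one and the weights sum to one, the mixed distribution also sums to one. A word missing from the domain LM (here "river") still receives probability mass from the background LM.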
Bibliographic reference. Sun, Jiasong / Cui, Xiaodong / Wang, Zuoying / Liu, Yang (2000): "A language model adaptation approach based on text classification", In ICSLP-2000, vol.4, 516-519.