In this paper, we divide the corpus into 8 domains by text classification with the K-means algorithm and estimate a trigram LM for each domain. However, experiments show that performance becomes worse in some domains. To solve this problem, we perform LM adaptation based on the domain LMs: each domain LM is mixed with the background LM by linear interpolation. Relative word error rate reductions of between 5 and 10% over the pruned background LM are achieved.
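The linear interpolation mentioned in the abstract can be illustrated with a minimal sketch. It assumes each LM is available as a plain dictionary mapping a (history, word) pair to a probability, and that a single weight `lam` is used per domain. The abstract does not give the interpolation weights or how they are estimated, so `lam`, the function names, and the toy probabilities below are hypothetical and serve only as illustration.

```python
def interpolate(p_domain, p_background, lam):
    """Linear interpolation: lam * p_domain + (1 - lam) * p_background."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * p_domain + (1.0 - lam) * p_background


def interpolate_lm(domain_lm, background_lm, lam):
    """Mix two LMs stored as {(history, word): probability} dictionaries.

    Entries missing from one model are treated as probability 0.0 here;
    a real system would back off to lower-order n-grams instead
    (assumption made only to keep the sketch short).
    """
    keys = set(domain_lm) | set(background_lm)
    return {
        key: interpolate(domain_lm.get(key, 0.0),
                         background_lm.get(key, 0.0),
                         lam)
        for key in keys
    }


# Toy trigram probabilities (hypothetical values, not from the paper).
domain_lm = {(("the", "stock"), "market"): 0.5}
background_lm = {(("the", "stock"), "market"): 0.25,
                 (("the", "stock"), "room"): 0.125}

adapted_lm = interpolate_lm(domain_lm, background_lm, lam=0.5)
```

With `lam = 1.0` the adapted model reduces to the domain LM, and with `lam = 0.0` it reduces to the background LM, so the weight controls how far the adapted model moves toward the domain.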
Cite as: Sun, J., Cui, X., Wang, Z., Liu, Y. (2000) A language model adaptation approach based on text classification. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 4, 516-519, doi: 10.21437/ICSLP.2000-862
@inproceedings{sun00c_icslp,
  author={Jiasong Sun and Xiaodong Cui and Zuoying Wang and Yang Liu},
  title={{A language model adaptation approach based on text classification}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  pages={vol. 4, 516-519},
  doi={10.21437/ICSLP.2000-862}
}