September 22-25, 1997
Adapting language models to specific subject domains is important for real speech recognition applications. This paper presents a Chinese language model adaptation approach based mainly on document classification and multiple domain-specific language models. The proposed document classification method uses the perplexity value and the word bigram coverage value as its primary measures. It is therefore able to model word associations and syntactic behavior when classifying documents into clusters, and thus creates more effective domain-specific language models. Language model adaptation in speech recognition can then be achieved effectively by selecting the most appropriate domain-specific language model. Preliminary tests in Mandarin speech recognition show promising performance for the proposed approach in real applications.
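The two measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how perplexity and bigram coverage are combined, how the models are smoothed, or how Chinese text is segmented. The sketch below assumes pre-segmented word lists, add-one smoothing, and a selection rule that prefers the lowest perplexity and uses bigram coverage only to break ties. The names `BigramLM` and `select_domain` are hypothetical.

```python
import math
from collections import Counter


def _pad(words):
    """Add sentence-boundary markers to one segmented sentence."""
    return ["<s>"] + list(words) + ["</s>"]


class BigramLM:
    """Word bigram model with add-one smoothing (smoothing is an assumption)."""

    def __init__(self, sentences):
        self.unigrams = Counter()
        self.bigrams = Counter()
        for words in sentences:
            tokens = _pad(words)
            self.unigrams.update(tokens)
            self.bigrams.update(zip(tokens, tokens[1:]))
        # One extra slot stands in for any word unseen in training.
        self.vocab_size = len(self.unigrams) + 1

    def prob(self, prev, word):
        """Smoothed estimate of P(word | prev)."""
        return (self.bigrams[(prev, word)] + 1) / (
            self.unigrams[prev] + self.vocab_size
        )

    def perplexity(self, sentences):
        """Per-bigram perplexity of the document under this model."""
        log_sum, count = 0.0, 0
        for words in sentences:
            tokens = _pad(words)
            for prev, word in zip(tokens, tokens[1:]):
                log_sum += math.log(self.prob(prev, word))
                count += 1
        return math.exp(-log_sum / count)

    def bigram_coverage(self, sentences):
        """Fraction of the document's bigrams seen in the training data."""
        seen, total = 0, 0
        for words in sentences:
            tokens = _pad(words)
            for pair in zip(tokens, tokens[1:]):
                total += 1
                seen += pair in self.bigrams
        return seen / total


def select_domain(document, models):
    """Return (domain, perplexity, coverage) for the best-matching model.

    Assumed rule: lowest perplexity wins; higher coverage breaks ties.
    """
    scored = [
        (name, lm.perplexity(document), lm.bigram_coverage(document))
        for name, lm in models.items()
    ]
    return min(scored, key=lambda item: (item[1], -item[2]))
```

Under these assumptions, a document is assigned to the domain whose model returns the best score, and a recognizer would then load that domain's language model. Clustering training documents into domains in the first place could reuse the same two measures, but the abstract does not describe that procedure.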
Bibliographic reference. Lin, Sung-Chien / Tsai, Chi-Lung / Chien, Lee-Feng / Chen, Ker-Jiann / Lee, Lin-Shan (1997): "Chinese language model adaptation based on document classification and multiple domain-specific language models", In EUROSPEECH-1997, 1463-1466.