INTERSPEECH 2004 - ICSLP
This paper describes recent experiments on unsupervised language model (LM) adaptation for the transcription of broadcast news data. In previous work, a framework for automatically selecting adaptation data using information retrieval techniques was proposed. This work extends that method and presents experimental results with unsupervised language model adaptation. Three primary aspects are considered: (1) the performance of five widely used LM adaptation methods is compared using the same adaptation data; (2) the influence of the temporal distance between the training and test data epochs on the adaptation efficiency is assessed; and (3) show-based language model adaptation is compared with story-based language model adaptation. Experiments have been carried out for broadcast news transcription in English and Mandarin Chinese. With story-based MDI (minimum discrimination information) adaptation, a relative word error rate reduction of 4.7% was obtained in English and a relative character error rate reduction of 5.6% in Mandarin.
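As a reading aid for the figures quoted above, the following minimal sketch shows how a relative error rate reduction is conventionally computed from a baseline and an adapted error rate. The function name and the example error rates are hypothetical and are not taken from the paper, which reports only the relative reductions in this abstract.

```python
def relative_reduction(baseline_error: float, adapted_error: float) -> float:
    """Return the relative error rate reduction, in percent.

    Both arguments are error rates (word or character) in the same unit,
    e.g. percent. The result is positive when adaptation lowers the error.
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - adapted_error) / baseline_error


# Hypothetical illustration only: these error rates are NOT from the paper.
baseline = 20.0  # assumed baseline error rate, in percent
adapted = 19.0   # assumed error rate after adaptation, in percent
print(f"{relative_reduction(baseline, adapted):.1f}% relative reduction")
```

Note that a relative reduction is measured against the baseline error rate, so the same relative figure corresponds to different absolute gains depending on the baseline.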
Bibliographic reference. Chen, Langzhou / Lamel, Lori / Gauvain, Jean-Luc / Adda, Gilles (2004): "Dynamic language modeling for broadcast news", In INTERSPEECH-2004, 997-1000.