ITRW on
This paper reports on language model adaptation for the broadcast news transcription task. Language model adaptation for this task is challenging in that the subject of any particular show, or portion thereof, is unknown in advance and is often related to more than one topic. One of the problems in language model adaptation is the extraction of reliable topic information from the audio signal, particularly in the presence of recognition errors. In this work, we draw upon techniques used in information retrieval to extract topic information from the word recognizer hypotheses; this information is then used to automatically select adaptation data from a large general text corpus. Two adaptive language models, a mixture-based model and a MAP-based model, have been investigated using the adaptation data. Experiments carried out with the LIMSI Mandarin broadcast news transcription system give a relative character error rate reduction of 4.3% when both adaptation methods are combined.
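The general idea of retrieval-based data selection followed by a mixture model can be illustrated with a minimal sketch. This is not the authors' implementation: the TF-IDF weighting, cosine ranking, unigram models, toy English corpus, interpolation weight `lam`, and all function names below are assumptions chosen for illustration, and the MAP-based model is not shown.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict) per tokenized document, plus the IDF table."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumed weighting, not taken from the paper).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vecs = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vecs, idf

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def select_adaptation_docs(hypothesis, corpus, top_k=2):
    """Use the recognizer hypothesis as a query; return indices of the top_k closest documents."""
    vecs, idf = tfidf_vectors(corpus)
    # Hypothesis words unseen in the corpus are ignored.
    query = {t: tf * idf[t] for t, tf in Counter(hypothesis).items() if t in idf}
    ranked = sorted(range(len(corpus)), key=lambda i: cosine(query, vecs[i]), reverse=True)
    return ranked[:top_k]

def unigram(docs):
    """Maximum-likelihood unigram model over a list of tokenized documents."""
    counts = Counter(t for d in docs for t in d)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}

def mixture_prob(word, general, adapted, lam=0.5):
    """Linear interpolation of the adapted and general models (lam is a hypothetical weight)."""
    return lam * adapted.get(word, 0.0) + (1 - lam) * general.get(word, 0.0)

# Toy general corpus and a (possibly errorful) recognizer hypothesis.
corpus = [
    "the central bank raised interest rates".split(),
    "the team won the football match".split(),
    "interest rates and inflation worry the bank".split(),
    "heavy rain flooded the city streets".split(),
]
hypothesis = "bank interest rates rise".split()

selected = select_adaptation_docs(hypothesis, corpus, top_k=2)
general = unigram(corpus)
adapted = unigram([corpus[i] for i in selected])
```

In this toy setting the two finance-related documents are selected, so `mixture_prob("bank", general, adapted)` is higher than the general-model probability of "bank", while off-topic words such as "football" are discounted. A real system would use higher-order n-gram models and tune the interpolation weight on held-out data.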
Bibliographic reference. Chen, Langzhou / Gauvain, Jean-Luc / Lamel, Lori / Adda, Gilles / Adda-Decker, Martine (2001): "Language model adaptation for broadcast news transcription", In Adaptation-2001, 195-198.