ITRW on
Adaptation Methods for Speech Recognition

August 29-30, 2001
Sophia Antipolis, France

Language Model Adaptation for Broadcast News Transcription

Langzhou Chen, Jean-Luc Gauvain, Lori Lamel, Gilles Adda and Martine Adda-Decker

Spoken Language Processing Group, LIMSI-CNRS, Orsay, France

This paper reports on language model adaptation for the broadcast news transcription task. Language model adaptation for this task is challenging in that the subject of any particular show or portion thereof is unknown in advance and is often related to more than one topic. One of the problems in language model adaptation is the extraction of reliable topic information from the audio signal, particularly in the presence of recognition errors. In this work, we draw upon techniques used in information retrieval to extract topic information from the word recognizer hypotheses; this information is then used to automatically select adaptation data from a large general text corpus. Two adaptive language models, a mixture-based model and a MAP-based model, have been investigated using the adaptation data. Experiments carried out with the LIMSI Mandarin broadcast news transcription system give a relative character error rate reduction of 4.3% when both adaptation methods are combined.
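The pipeline outlined in the abstract (retrieve adaptation text using the recognizer hypothesis, then adapt the language model) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: tf-idf weighting with cosine similarity stands in for the unspecified information retrieval technique, add-one smoothed unigram models stand in for the real n-gram models, and a two-component linear interpolation stands in for the mixture-based model. The MAP-based model is not shown, and the function names and the weight `lam` are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return a sparse tf-idf vector (dict) per tokenized document, plus the idf table."""
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))
    # Smoothed idf (an assumption); always positive for words seen in the corpus.
    idf = {w: math.log((n + 1) / (c + 1)) + 1.0 for w, c in df.items()}
    vecs = [{w: tf * idf[w] for w, tf in Counter(d).items()} for d in docs]
    return vecs, idf

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def select_adaptation_data(hypothesis, corpus, top_k):
    """Rank corpus documents by similarity to the hypothesis and keep the top_k."""
    vecs, idf = tfidf_vectors(corpus)
    # Words unseen in the corpus (e.g. recognition errors) get zero weight.
    query = {w: tf * idf.get(w, 0.0) for w, tf in Counter(hypothesis).items()}
    ranked = sorted(range(len(corpus)),
                    key=lambda i: cosine(query, vecs[i]), reverse=True)
    return [corpus[i] for i in ranked[:top_k]]

def unigram(docs, vocab):
    """Add-one smoothed unigram model over a fixed vocabulary."""
    counts = Counter(w for d in docs for w in d)
    total = sum(counts.values()) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}

def interpolate(p_general, p_adapt, lam):
    """Linear interpolation: lam * adapted + (1 - lam) * general."""
    return {w: lam * p_adapt[w] + (1.0 - lam) * p_general[w] for w in p_general}

# Toy data: three "documents" and one recognizer hypothesis containing an error.
corpus = [["stock", "market", "bank"],
          ["football", "match", "goal"],
          ["bank", "rate", "market"]]
hypothesis = ["market", "bank", "xyz"]

selected = select_adaptation_data(hypothesis, corpus, top_k=2)
vocab = {w for d in corpus for w in d}
p_general = unigram(corpus, vocab)
p_adapt = unigram(selected, vocab)
p_mixed = interpolate(p_general, p_adapt, lam=0.5)
```

In this toy setting the sports document is not selected, so the interpolated model shifts probability mass toward the finance vocabulary. In practice the interpolation weight would be tuned on held-out data rather than fixed at 0.5.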


Bibliographic reference.  Chen, Langzhou / Gauvain, Jean-Luc / Lamel, Lori / Adda, Gilles / Adda-Decker, Martine (2001): "Language model adaptation for broadcast news transcription", In Adaptation-2001, 195-198.