This paper presents a successful technique for creating a new language model (LM) that adapts the original target LM used by a machine translation (MT) system. The technique is especially useful when training resources for the target side (Spanish Sign Language (LSE) in our case) are too scarce to properly estimate the target LM, the Sign Language Model (SLM), used by the MT system. It uses information from the source language, Spanish in our task, and from the phrase-based translation matrix to create a new LM, estimated using web frequencies, which adapts the counts of the SLM through the Maximum A Posteriori (MAP) method. The corpus consists of commonly used sentences spoken by an officer when assisting people in applying for, or renewing, the National Identification Document. The proposed technique yields relative reductions of 15.5% in perplexity and 2.7% in word error rate (WER) for translation, which are close to half the maximum improvement obtainable when only the LM is optimized.
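As a rough illustration of MAP count adaptation, the sketch below combines sparse in-domain counts with a prior distribution, which here stands in for the web-frequency LM. The abstract does not give the exact formulation, so this uses one common textbook form on unigrams only, P(w) = (c(w) + tau * P_prior(w)) / (N + tau). The function name `map_adapt`, the weight `tau`, and the sign glosses and numbers are all hypothetical, chosen only for the example.

```python
def map_adapt(in_domain_counts, prior_probs, tau=4.0):
    """Return MAP-adapted unigram probabilities.

    in_domain_counts: counts from the small target-side corpus (assumed data).
    prior_probs: prior distribution, e.g. derived from web frequencies;
                 it should sum to 1 so the adapted model also sums to 1.
    tau: prior weight (hypothetical value); larger tau trusts the prior more.
    """
    vocab = set(in_domain_counts) | set(prior_probs)
    total = sum(in_domain_counts.values())
    return {
        w: (in_domain_counts.get(w, 0) + tau * prior_probs.get(w, 0.0)) / (total + tau)
        for w in vocab
    }


# Toy example with made-up sign glosses and made-up numbers.
slm_counts = {"DNI": 3, "RENOVAR": 1}
web_prior = {"DNI": 0.5, "RENOVAR": 0.25, "FOTO": 0.25}

adapted = map_adapt(slm_counts, web_prior, tau=4.0)
for gloss in sorted(adapted):
    print(gloss, adapted[gloss])
```

In this toy setting the gloss `FOTO`, unseen in the in-domain counts, receives non-zero probability from the prior, which is the general effect one expects from adapting a sparse model with an external estimate.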
Bibliographic reference. D'Haro, Luis Fernando / San-Segundo, Ruben / Cordoba, Ricardo de / Bungeroth, Jan / Stein, Daniel / Ney, Hermann (2008): "Language model adaptation for a speech to sign language translation system using web frequencies and a MAP framework", In INTERSPEECH-2008, 2199-2202.