Combining automatic speech recognition and machine translation is common in current research programs. This paper first presents several pre-processing steps to limit the performance degradation observed when translating an automatic transcription (as opposed to a manual transcription). Indeed, automatically transcribed speech often differs significantly from the machine translation system's training material with respect to casing, punctuation and word normalization. The proposed system outperforms the best system at the 2007 TC-STAR evaluation by almost 2 BLEU points. The paper then attempts to determine a criterion characterizing how well the output of an STT system can be translated, but the current experiments could only confirm that lower word error rates lead to better translations.
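For readers unfamiliar with the word error rate mentioned above, the following is a minimal sketch of the standard definition: the word-level edit distance (substitutions, deletions and insertions) between a reference transcription and a hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; this is not the scoring tool used in the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: tokens are obtained by whitespace splitting and
    compared exactly, so casing and punctuation differences count as errors.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Because the comparison is exact, `word_error_rate("Hello world", "hello world")` counts the casing difference as one error out of two words. This shows, in a small way, why mismatches in casing and normalization between transcriptions and other text matter.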
Bibliographic reference. Déchelotte, Daniel / Schwenk, Holger / Adda, Gilles / Gauvain, Jean-Luc (2007): "Improved machine translation of speech-to-text outputs", In INTERSPEECH-2007, 2441-2444.