8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

The Influence of Utterance Chunking on Machine Translation Performance

Christian Fügen, Muntsin Kolss

Universität Karlsruhe (TH), Germany

Speech translation systems commonly couple automatic speech recognition (ASR) and machine translation (MT) components. The automatic segmentation of the ASR output for the subsequent MT step is critical for overall performance. In simultaneous translation systems, which require continuous output with low latency, chunking the ASR output into translatable segments is even more critical. This paper presents an empirical study of how utterance chunking influences machine translation performance. In addition, machine translation performance is related to the segment length produced by each chunking strategy, which is important for simultaneous translation. To this end, we compare different chunking/segmentation strategies on speech recognition hypotheses as well as on reference transcripts.

Bibliographic reference. Fügen, Christian / Kolss, Muntsin (2007): "The influence of utterance chunking on machine translation performance", in INTERSPEECH-2007, 2837-2840.