Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Finite-State Models for Lexical Reordering in Spoken Language Translation

Srinivas Bangalore, Giuseppe Riccardi

AT&T Labs - Research, Florham Park, NJ, USA

Machine translation can be viewed as consisting of two phases: (a) a lexical choice phase, in which appropriate target-language lexical items (words or phrases) are chosen for each source-language lexical item, and (b) a reordering phase, in which the chosen target-language lexical items are reordered to produce a meaningful target-language string. In earlier work we showed that finite-state models for lexical choice can be learned from bilingual corpora [1]. In this paper, we focus on stochastic finite-state models for lexical reordering and describe an algorithm for learning them from bilingual corpora. We have developed a stochastic finite-state English-Japanese translation system by composing finite-state lexical choice and lexical reordering models. We evaluated it using the string edit distance between the translated string and a given reference string. Using this metric, the English-Japanese translation system scored 70.9% on English speech transcriptions.
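
The evaluation metric mentioned above can be illustrated with a minimal sketch of a word-level string edit distance (Levenshtein distance) and an accuracy score derived from it. The abstract does not state how the distance is normalized into a percentage, so the `string_accuracy` function below, which divides by the reference length, is an assumption made for illustration only; the function names and the example tokens are hypothetical as well.

```python
def edit_distance(hyp, ref):
    """Levenshtein distance between two token sequences.

    Counts the minimum number of insertions, deletions, and
    substitutions needed to turn `hyp` into `ref`.
    """
    # prev[j] holds the distance between the hypothesis prefix processed
    # so far and the first j tokens of the reference.
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]  # distance from i hypothesis tokens to an empty reference
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete h from the hypothesis
                            curr[j - 1] + 1,      # insert r into the hypothesis
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def string_accuracy(hyp, ref):
    """Accuracy in [0, 1] derived from edit distance.

    ASSUMPTION: normalizes by reference length; the abstract does not
    specify the normalization used in the paper.
    """
    if not ref:
        return 1.0 if not hyp else 0.0
    return max(0.0, 1.0 - edit_distance(hyp, ref) / len(ref))


# Hypothetical token sequences standing in for a system output and a reference.
hypothesis = "w1 w3 w2 w4".split()
reference = "w1 w2 w3 w4".split()
print(edit_distance(hypothesis, reference))
print(string_accuracy(hypothesis, reference))
```

Because plain edit distance has no move operation, a pair of swapped words is charged as two edits. A score of this kind is therefore sensitive to exactly the word-order errors that a reordering model is meant to reduce.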

Reference

  1. Srinivas Bangalore and Giuseppe Riccardi. Stochastic finite-state models for spoken language machine translation. In Proceedings of the Workshop on Embedded Machine Translation Systems, pages 52-59, 2000.


Bibliographic reference.  Bangalore, Srinivas / Riccardi, Giuseppe (2000): "Finite-state models for lexical reordering in spoken language translation", In ICSLP-2000, vol.4, 422-425.