9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Factored Translation Models for Enriching Spoken Language Translation with Prosody

Vivek Kumar Rangarajan Sridhar (1), Srinivas Bangalore (2), Shrikanth S. Narayanan (1)

(1) University of Southern California, USA
(2) AT&T Labs Research, USA

Key contextual information such as word prominence, emphasis, and contrast is typically ignored in speech-to-speech (S2S) translation due to the compartmentalized nature of the translation process. Conventional S2S systems rely on extracting prosody-dependent cues from hypothesized (possibly erroneous) translation output using only words and syntax. In contrast, we propose the use of factored translation models to integrate the assignment and transfer of pitch accents (tonal prominence) during translation. We report experiments on two parallel corpora (Farsi-English and Japanese-English). The proposed factored translation models provide relative improvements of 8.4% and 16.8% in pitch accent labeling accuracy over the post-processing approach for the two corpora, respectively.


Bibliographic reference. Sridhar, Vivek Kumar Rangarajan / Bangalore, Srinivas / Narayanan, Shrikanth S. (2008): "Factored translation models for enriching spoken language translation with prosody", In INTERSPEECH-2008, 2723-2726.