9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Machine Translation in Continuous Space

Ruhi Sarikaya, Yonggang Deng, Mohamed Afify, Brian Kingsbury, Yuqing Gao

IBM T.J. Watson Research Center, USA

We present a different perspective on the machine translation problem that relies upon continuous-space probabilistic models for words and phrases. Within this perspective we propose a method called Tied-Mixture Machine Translation (TMMT) that uses a trainable parametric model employing Gaussian mixture probability density functions to represent word- and phrase-pairs. In the new perspective, machine translation is treated in the same way as acoustic modeling in speech recognition. This new treatment carries several potential advantages that may improve state-of-the-art machine translation systems, including better generalization to unseen events; adaptation to new domains, languages, genres, and speakers via methods such as Maximum-Likelihood Linear Regression (MLLR); and improved discrimination through discriminative training methods such as Maximum Mutual Information Estimation (MMIE). Our goal in this paper, however, is to introduce the new approach and demonstrate its viability, leaving investigation of these potential advantages to future work. To this end, we report preliminary experiments with the proposed method.
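
As a rough illustration of the tied-mixture idea borrowed from acoustic modeling, the sketch below assumes that each word- or phrase-pair is scored against a continuous feature vector `x` using one pool of Gaussians shared by all pairs, with only the mixture weights specific to each pair. The abstract does not say how the feature vectors are obtained or how the parameters are trained, so the function names, the diagonal covariances, and the toy numbers are illustrative assumptions and not the authors' implementation.

```python
import numpy as np


def log_gaussian_diag(x, means, variances):
    """Log density of x under each diagonal-covariance Gaussian in the shared pool.

    x: (d,) feature vector; means, variances: (K, d) pool parameters.
    Returns an array of K log densities.
    """
    d = x.shape[0]
    diff = x - means
    return -0.5 * (
        d * np.log(2.0 * np.pi)
        + np.sum(np.log(variances), axis=1)
        + np.sum(diff ** 2 / variances, axis=1)
    )


def tied_mixture_log_likelihood(x, means, variances, weights):
    """log p(x | pair) = log sum_k w_k N(x; mu_k, var_k).

    The Gaussians (means, variances) are tied across all pairs;
    `weights` (K positive values summing to 1) belong to one pair.
    """
    log_comp = log_gaussian_diag(x, means, variances) + np.log(weights)
    m = np.max(log_comp)  # log-sum-exp for numerical stability
    return m + np.log(np.sum(np.exp(log_comp - m)))


# Toy pool of two 2-dimensional Gaussians shared by two hypothetical pairs.
pool_means = np.array([[0.0, 0.0], [3.0, 3.0]])
pool_vars = np.ones((2, 2))
weights_pair_a = np.array([0.9, 0.1])  # hypothetical pair A favors Gaussian 0
weights_pair_b = np.array([0.1, 0.9])  # hypothetical pair B favors Gaussian 1

x = np.array([0.2, -0.1])  # assumed continuous representation of an observation
score_a = tied_mixture_log_likelihood(x, pool_means, pool_vars, weights_pair_a)
score_b = tied_mixture_log_likelihood(x, pool_means, pool_vars, weights_pair_b)
```

Because only the weight vectors differ between pairs, adapting or retraining the shared Gaussians affects every pair at once, which is the property that makes techniques such as MLLR applicable in the acoustic-modeling setting.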

Bibliographic reference.  Sarikaya, Ruhi / Deng, Yonggang / Afify, Mohamed / Kingsbury, Brian / Gao, Yuqing (2008): "Machine translation in continuous space", In INTERSPEECH-2008, 2350-2353.