ISCA Archive Interspeech 2009

Using syntax in large-scale audio document translation

Jing Zheng, Necip Fazil Ayan, Wen Wang, David Burkett

Recently, the use of syntax has very effectively improved machine translation (MT) quality in many text translation tasks. However, using syntax in speech translation poses additional challenges because of disfluencies and other spoken language phenomena, and because of errors introduced by automatic speech recognition (ASR). In this paper, we investigate the effect of using syntax in a large-scale audio document translation task targeting broadcast news and broadcast conversations. We do so by comparing the performance of three translation approaches based on synchronous context-free grammars: 1) hierarchical phrase-based translation, 2) syntax-augmented MT, and 3) string-to-dependency MT. The results show a positive effect of explicitly using syntax when translating broadcast news, but no benefit when translating broadcast conversations. The results indicate that improving the robustness of syntactic systems against conversational language style is important to their success and requires future effort.

doi: 10.21437/Interspeech.2009-158

Cite as: Zheng, J., Ayan, N.F., Wang, W., Burkett, D. (2009) Using syntax in large-scale audio document translation. Proc. Interspeech 2009, 440-443, doi: 10.21437/Interspeech.2009-158

  author={Jing Zheng and Necip Fazil Ayan and Wen Wang and David Burkett},
  title={{Using syntax in large-scale audio document translation}},
  booktitle={Proc. Interspeech 2009},