ISCA Archive Interspeech 2013

Simple, lexicalized choice of translation timing for simultaneous speech translation

Tomoki Fujita, Graham Neubig, Sakriani Sakti, Tomoki Toda, Satoshi Nakamura

Conventional speech translation systems wait until the end of the input sentence before starting translation, causing a large delay in the translation process. Methods have been proposed to reduce this delay by dividing the input utterance on pause boundaries, but while these methods have proven useful on speech translation of language pairs with similar word order, they are insensitive to linguistic information and less effective for languages that require more word reordering. In this work, we propose a method that uses lexicalized information to perform translation unit segmentation considering the relationship between the source and target languages. In particular, we use the phrase table and reordering probabilities used in phrase-based translation systems to decide points in the sentence where we can begin translation with less delay. Through an experimental evaluation, we confirmed that the proposed method significantly reduces delay for Japanese-English and French-English translation. We also show that a parameter introduced in our model can adjust the trade-off between simultaneity and accuracy, and that in situations that require a large degree of simultaneity, our system can achieve a delay reduction of 20% compared to pause segmentation with identical accuracy.
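To make the idea of lexicalized segmentation concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes a toy table (`MONOTONE_PROB`, hypothetical) that maps each source phrase to the probability that the following phrase keeps the same order on the target side, and it closes a translation unit whenever that probability reaches a `threshold`. The table entries, the greedy longest-match lookup, and all names are assumptions made for illustration; the abstract only states that phrase table and reordering probabilities are used and that a parameter controls the trade-off between simultaneity and accuracy.

```python
from typing import Dict, List, Tuple

# Hypothetical toy table (not from the paper): source phrase -> probability
# that the next phrase follows it in the same order on the target side.
MONOTONE_PROB: Dict[Tuple[str, ...], float] = {
    ("je",): 0.9,
    ("ne", "sais", "pas"): 0.8,
    ("la", "maison"): 0.3,
    ("bleue",): 0.9,
}
MAX_PHRASE_LEN = 3  # longest source phrase in the toy table


def segment(words: List[str], threshold: float) -> List[List[str]]:
    """Greedily split `words` into translation units.

    A unit is closed after a known phrase whose monotone probability
    is at least `threshold`; any remainder is flushed at the end.
    """
    units: List[List[str]] = []
    current: List[str] = []
    i = 0
    while i < len(words):
        # Longest-match lookup of a known phrase starting at position i.
        for n in range(min(MAX_PHRASE_LEN, len(words) - i), 0, -1):
            phrase = tuple(words[i:i + n])
            if phrase in MONOTONE_PROB:
                break
        else:
            phrase = (words[i],)  # unknown word: treat it as its own phrase
        current.extend(phrase)
        i += len(phrase)
        # Unknown phrases score 0.0, so they only split when threshold <= 0.
        if MONOTONE_PROB.get(phrase, 0.0) >= threshold:
            units.append(current)
            current = []
    if current:
        units.append(current)  # flush whatever is left at the end of the utterance
    return units
```

In this sketch, lowering `threshold` yields more, shorter units (less delay, more risk of committing before a reordering), while raising it yields fewer, longer units. The longest-match lookup peeks up to `MAX_PHRASE_LEN - 1` words ahead, which is a simplification that a truly incremental system would need to handle differently.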


doi: 10.21437/Interspeech.2013-615

Cite as: Fujita, T., Neubig, G., Sakti, S., Toda, T., Nakamura, S. (2013) Simple, lexicalized choice of translation timing for simultaneous speech translation. Proc. Interspeech 2013, 3487-3491, doi: 10.21437/Interspeech.2013-615

@inproceedings{fujita13_interspeech,
  author={Tomoki Fujita and Graham Neubig and Sakriani Sakti and Tomoki Toda and Satoshi Nakamura},
  title={{Simple, lexicalized choice of translation timing for simultaneous speech translation}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={3487--3491},
  doi={10.21437/Interspeech.2013-615}
}