International Workshop on Spoken Language Translation (IWSLT) 2012
In spoken language translation (SLT), finding proper segmentation and reconstructing punctuation marks are both significant and challenging tasks. In this paper we present our recent work on speech translation quality analysis for German-English by improving sentence segmentation and punctuation.
Oracle experiments establish an upper bound on translation quality: the gain obtainable if the output stream of the speech recognition system carried human-generated segmentation and punctuation. In these oracle experiments we gain 1.78 BLEU points on the lecture test set. We then build a monolingual translation system from German to German that treats segmentation and punctuation prediction as a machine translation task. Using this monolingual translation system we obtain an improvement of 1.53 BLEU points on the lecture test set, which is comparable to the upper bound set by the oracle experiments.
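To illustrate what treating punctuation prediction as a translation task can mean in practice, the sketch below builds one parallel training pair for a German-to-German system: the source side imitates unpunctuated recognizer output, and the target side is the same text with punctuation kept as separate tokens. The abstract does not describe the actual preprocessing, so the function names `tokenize` and `make_pair`, the punctuation set, and the decision to leave casing untouched are all assumptions made for illustration only.

```python
import re

# Assumption: this small punctuation set is illustrative, not taken from the paper.
PUNCTUATION = set(".,!?;:")


def tokenize(text):
    """Split text on whitespace, with each punctuation mark as its own token."""
    spaced = re.sub(r"([.,!?;:])", r" \1 ", text)
    return spaced.split()


def make_pair(sentences):
    """Build one (source, target) pair for a monolingual translation system.

    target: the punctuated text, tokenized.
    source: the same tokens with punctuation removed, as a hypothetical
            stand-in for an unsegmented speech recognition output stream.
    """
    target_tokens = tokenize(" ".join(sentences))
    source_tokens = [t for t in target_tokens if t not in PUNCTUATION]
    return " ".join(source_tokens), " ".join(target_tokens)


source, target = make_pair(["Guten Morgen.", "Wie geht es Ihnen, Herr Meier?"])
print(source)  # Guten Morgen Wie geht es Ihnen Herr Meier
print(target)  # Guten Morgen . Wie geht es Ihnen , Herr Meier ?
```

A translation system trained on many such pairs would learn to insert punctuation, and sentence boundaries can then be read off the predicted sentence-final marks. Casing is left unchanged here; whether the source side should also be lowercased depends on the recognizer output and is not specified in this abstract.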
Bibliographic reference. Cho, Eunah / Niehues, Jan / Waibel, Alex (2012): "Segmentation and punctuation prediction in speech language translation using a monolingual translation system", In IWSLT-2012, 252-259.