ISCA Archive IWSLT 2010

Morphtagger: HMM-based Arabic segmentation for statistical machine translation

Saab Mansour

In this paper, we investigate different approaches to Arabic segmentation for statistical machine translation by comparing a rule-based segmenter with several statistically based segmenters. We also present a new segmentation method that meets the needs of a real-time translation system without impairing translation accuracy.


Cite as: Mansour, S. (2010) Morphtagger: HMM-based Arabic segmentation for statistical machine translation. Proc. International Workshop on Spoken Language Translation (IWSLT 2010), 321-327

@inproceedings{mansour10b_iwslt,
  author={Saab Mansour},
  title={{Morphtagger: HMM-based Arabic segmentation for statistical machine translation}},
  year=2010,
  booktitle={Proc. International Workshop on Spoken Language Translation (IWSLT 2010)},
  pages={321--327}
}