In this paper, we investigate different methodologies for Arabic segmentation in statistical machine translation by comparing a rule-based segmenter with several statistical segmenters. We also present a new segmentation method that meets the needs of a real-time translation system without impairing translation accuracy.
Cite as: Mansour, S. (2010) Morphtagger: HMM-based Arabic segmentation for statistical machine translation. Proc. International Workshop on Spoken Language Translation (IWSLT 2010), 321-327
@inproceedings{mansour10b_iwslt,
  author={Saab Mansour},
  title={{Morphtagger: HMM-based Arabic segmentation for statistical machine translation}},
  year=2010,
  booktitle={Proc. International Workshop on Spoken Language Translation (IWSLT 2010)},
  pages={321--327}
}