ISCA Archive ICSLP 2000

A rule-based approach to farsi language text-to-phoneme conversion

Mohammad Reza Sadigh, Hamid Sheikhzadeh, M. R. Jahangir, Arash Farzan

Conversion from orthographic (written) form to a phonetic transcription is the first stage in a text-to-speech system. In this study, algorithms are presented to facilitate text-to-phoneme (TTP) conversion for the Farsi language. Using a lexicon of about 15000 base morphemes, word formation rules are investigated and implemented. Moreover, the written sentence has to be segmented into words prior to any phonetic transcription of the text. Due to the special form of Farsi orthography, word segmentation is a complicated process. To solve this problem, a fast on-line algorithm and a more complicated off-line algorithm are presented. The overall performance of the TTP conversion is evaluated to be more than 90%.


Cite as: Sadigh, M.R., Sheikhzadeh, H., Jahangir, M.R., Farzan, A. (2000) A rule-based approach to farsi language text-to-phoneme conversion. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 1, 532-535

@inproceedings{sadigh00_icslp,
  author={Mohammad Reza Sadigh and Hamid Sheikhzadeh and M. R. Jahangir and Arash Farzan},
  title={{A rule-based approach to farsi language text-to-phoneme conversion}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  volume={1},
  pages={532--535}
}