Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

A Rule-Based Approach to Farsi Language Text-To-Phoneme Conversion

Mohammad Reza Sadigh (1,2), Hamid Sheikhzadeh (1), M. R. Jahangir (1,2), Arash Farzan (2)

(1) Dept. of Elect. Eng., Amirkabir Univ. of Tech. (Tehran Polytechnic), (2) PSA Co. Ltd., Tehran, Iran

Conversion from the orthographic (written) form to a phonetic transcription is the first stage in a text-to-speech system. In this study, algorithms are presented to facilitate text-to-phoneme (TTP) conversion for the Farsi language. Using a lexicon of about 15,000 base morphemes, word formation rules are investigated and implemented. Moreover, the written sentence has to be segmented into words prior to any phonetic transcription of the text. Due to the special form of Farsi orthography, this word segmentation process is complicated. To solve the problem, a fast on-line algorithm and a more complicated off-line algorithm are presented. The overall performance of the TTP conversion is evaluated at more than 90%.


Bibliographic reference. Sadigh, Mohammad Reza / Sheikhzadeh, Hamid / Jahangir, M. R. / Farzan, Arash (2000): "A rule-based approach to Farsi language text-to-phoneme conversion", in ICSLP-2000, vol. 1, 532-535.