Vowelization presents a principal difficulty in building text-to-speech synthesizers for speech-to-speech translation systems. In this paper, a novel log-linear modeling method is proposed that takes into account vowel and diacritical information at both the word and character levels. A unique syllable-based normalization algorithm is then introduced to enhance both word coverage and data consistency. A recursive data generation and model training scheme is further devised to jointly optimize speech synthesizers and vowelizers for an English-Arabic speech translation system. The diacritization error rate is reduced by over 50% in vowelization experiments.
Bibliographic reference. Gu, Liang / Zhang, Wei / Tahir, Lazkin / Gao, Yuqing (2007): "Statistical vowelization of Arabic text for speech synthesis in speech-to-speech translation systems", In INTERSPEECH-2007, 1901-1904.