International Symposium on Chinese Spoken Language Processing (ISCSLP 2002)

Taipei, Taiwan
August 23-24, 2002

An Efficient Way to Learn Rules for Grapheme-to-Phoneme Conversion in Chinese

Zi-Rong Zhang, Min Chu, Eric Chang

Microsoft Research Asia, Beijing, China

Grapheme-to-phoneme (G2P) conversion is an essential component of a Text-to-Speech (TTS) system. In a Mandarin TTS system, the main problem the G2P component faces is determining the pronunciation of polyphonic characters. By studying the distribution and characteristics of polyphones in a large text corpus with corrected pinyin transcriptions, this paper shows that correct G2P conversion for 41 key polyphones and 22 key polyphonic multi-syllabic words will constrain the overall error rate to below 0.068%. A stochastic decision list based on Extended Stochastic Complexity is used to learn G2P conversion rules for these key polyphones and polyphonic words. With the generated rules, the G2P error rate decreased from 0.88% to 0.44%, a relative reduction of 50%. Tagging a corpus with correct pinyin for training and testing rules is a labor-intensive and time-consuming task. This paper also proposes a semi-automatic approach to this task, which saves almost half of the workload.
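
As a rough illustration of how a decision list resolves a polyphone, the sketch below applies an ordered list of context rules and returns the pinyin of the first rule that matches, falling back to a default reading. It is not the Extended Stochastic Complexity learner described in the abstract: the rules are hand-written, hypothetical examples for the character 行 (hang2 vs. xing2), and the names `Rule`, `RULES_FOR_XING`, and `predict` are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    """One entry of a decision list: a context test and the pinyin it predicts."""
    description: str
    matches: Callable[[str, int], bool]  # (text, index of polyphone) -> bool
    pinyin: str


def prev_char(text: str, i: int) -> str:
    """Character before position i, or '' at the start of the text."""
    return text[i - 1] if i > 0 else ""


def next_char(text: str, i: int) -> str:
    """Character after position i, or '' at the end of the text."""
    return text[i + 1] if i + 1 < len(text) else ""


# Hypothetical, hand-written rules for the polyphone 行.
# A learned list would be ordered by the learner's own criterion.
RULES_FOR_XING: List[Rule] = [
    Rule("preceded by 银 (as in 银行, bank)",
         lambda t, i: prev_char(t, i) == "银", "hang2"),
    Rule("followed by 业 (as in 行业, industry)",
         lambda t, i: next_char(t, i) == "业", "hang2"),
]
DEFAULT_READING = "xing2"  # assumed fallback reading when no rule fires


def predict(text: str, i: int, rules: List[Rule], default: str) -> str:
    """Return the pinyin of the first matching rule, else the default."""
    for rule in rules:
        if rule.matches(text, i):
            return rule.pinyin
    return default


if __name__ == "__main__":
    for text, i in [("银行", 1), ("行业", 0), ("行走", 0)]:
        print(text, predict(text, i, RULES_FOR_XING, DEFAULT_READING))
```

Because the first matching rule wins, the order of the rules matters; a learner's job is to choose both the rules and that order from tagged data.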

Bibliographic reference. Zhang, Zi-Rong / Chu, Min / Chang, Eric (2002): "An efficient way to learn rules for grapheme-to-phoneme conversion in Chinese", in ISCSLP 2002, paper 59.