8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Grapheme-to-Phoneme Conversion for Chinese Text-to-Speech

Jun Xu (1), Guohong Fu (2), Haizhou Li (3)

(1) InfoTalk Technology, Singapore
(2) The University of Hong Kong, Hong Kong
(3) Institute for Infocomm Research, Singapore

This paper reports a study of grapheme-to-phoneme (G2P) conversion for Chinese text-to-speech (TTS) systems. Because Chinese is a syllabic language, the syllable is commonly adopted as the phonetic unit in TTS and is represented by pinyin, the standard Chinese romanization. The task of Chinese G2P conversion is therefore to find the correct pinyin for polyphonic graphemes in the input text. This paper presents a complete G2P framework consisting of a two-stage statistical word segmentation module, a hidden Markov model (HMM) based part-of-speech (POS) tagging module, and a word-to-pinyin conversion module. In word-to-pinyin conversion, each word grapheme is augmented with its POS tag in an effort to resolve pronunciation ambiguity. The G2P experiments show that polyphone G2P accuracy improves by 9.41% after introducing the POS module and by a further 1.39% when the proposed word-to-pinyin method is applied.
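As a rough illustration of the idea of augmenting a word grapheme with its POS tag, the following minimal sketch looks up pinyin by (word, POS) pair and falls back to a default reading per word. It is not the authors' implementation: the lexicon entries, the tag letters (a, v, n, r, d), and the function name words_to_pinyin are assumptions made for this example, and the input is assumed to be already segmented and POS-tagged.

```python
# Hypothetical toy lexicon for illustration only; not the paper's data.
# Keys are (word, POS tag); values are pinyin syllables with tone numbers.
POS_LEXICON = {
    ("长", "a"): ["chang2"],  # adjective: "long"
    ("长", "v"): ["zhang3"],  # verb: "to grow"
    ("数", "n"): ["shu4"],    # noun: "number"
    ("数", "v"): ["shu3"],    # verb: "to count"
}

# Default reading per word, used when the (word, POS) pair is not listed.
DEFAULT_LEXICON = {
    "长": ["chang2"],
    "数": ["shu4"],
    "他": ["ta1"],
    "钱": ["qian2"],
    "路": ["lu4"],
    "很": ["hen3"],
}


def words_to_pinyin(tagged_words):
    """Convert a list of (word, pos) pairs into a flat list of pinyin syllables.

    The POS-augmented key is tried first; the plain word is the fallback.
    """
    pinyin = []
    for word, pos in tagged_words:
        reading = POS_LEXICON.get((word, pos))
        if reading is None:
            reading = DEFAULT_LEXICON.get(word)
        if reading is None:
            raise KeyError(f"no pinyin entry for word {word!r}")
        pinyin.extend(reading)
    return pinyin


if __name__ == "__main__":
    # "He counts money": 数 is tagged as a verb, so it is read shu3.
    print(words_to_pinyin([("他", "r"), ("数", "v"), ("钱", "n")]))
    # "The road is long": 长 is tagged as an adjective, so it is read chang2.
    print(words_to_pinyin([("路", "n"), ("很", "d"), ("长", "a")]))
```

In this sketch the POS tag acts only as an extra lookup key; a statistical word-to-pinyin model of the kind the abstract describes would presumably estimate the reading from data rather than from a hand-written table.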


Bibliographic reference.  Xu, Jun / Fu, Guohong / Li, Haizhou (2004): "Grapheme-to-phoneme conversion for Chinese text-to-speech", In INTERSPEECH-2004, 1885-1888.