INTERSPEECH 2004 - ICSLP
This paper reports a study of grapheme-to-phoneme (G2P) conversion for Chinese text-to-speech (TTS) systems. Because Chinese is a syllabic language, the syllable is commonly adopted as the phonetic unit in TTS and is represented by pinyin, the standard Chinese romanization. The task of Chinese G2P conversion is to find the correct pinyin for polyphonic graphemes in the input text. In this paper, a complete G2P framework is presented, which includes a two-stage statistical word segmentation module, a hidden Markov model (HMM) based part-of-speech (POS) tagging module, and a word-to-pinyin conversion module. In the word-to-pinyin conversion, each word grapheme is augmented with its POS tag in an effort to resolve pronunciation ambiguity in G2P. The G2P experiments show that polyphone G2P accuracy improves by 9.41% after introducing the POS module and by a further 1.39% when the proposed word-to-pinyin method is applied.
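The idea of augmenting a word grapheme with its POS tag can be illustrated with a minimal sketch. It assumes word segmentation and POS tagging have already been done, and it uses a plain dictionary lookup rather than the statistical modules described in the paper. The lexicon entries, the tag names ("a", "v", "n", and so on), the "<unk>" marker, and the function name word_to_pinyin are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon keyed on (word, POS tag) -> pinyin with tone digits.
# The polyphone "长" is read differently as an adjective and as a verb.
POS_LEXICON: Dict[Tuple[str, str], str] = {
    ("长", "a"): "chang2",  # adjective: "long"
    ("长", "v"): "zhang3",  # verb: "to grow"
}

# Hypothetical fallback lexicon keyed on the word alone, used when no
# POS-augmented entry exists for the (word, tag) pair.
DEFAULT_LEXICON: Dict[str, str] = {
    "长": "chang2",
    "路": "lu4",
    "很": "hen3",
    "他": "ta1",
    "高": "gao1",
    "了": "le5",
}


def word_to_pinyin(tagged_words: List[Tuple[str, str]]) -> List[str]:
    """Map segmented, POS-tagged words to pinyin.

    The (word, POS) key is tried first; if it is missing, the word-only
    default is used; unknown words are marked with "<unk>".
    """
    result: List[str] = []
    for word, pos in tagged_words:
        pinyin = POS_LEXICON.get((word, pos))
        if pinyin is None:
            pinyin = DEFAULT_LEXICON.get(word, "<unk>")
        result.append(pinyin)
    return result


if __name__ == "__main__":
    # "The road is long": 长 tagged as an adjective.
    print(word_to_pinyin([("路", "n"), ("很", "d"), ("长", "a")]))
    # "He has grown taller": 长 tagged as a verb.
    print(word_to_pinyin([("他", "r"), ("长", "v"), ("高", "a"), ("了", "u")]))
```

In this sketch the POS tag is the only context used to pick a reading, so a tagging error propagates directly into the pinyin output; the word-only fallback keeps the lookup from failing when a tag is unseen.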
Bibliographic reference. Xu, Jun / Fu, Guohong / Li, Haizhou (2004): "Grapheme-to-phoneme conversion for Chinese text-to-speech", In INTERSPEECH-2004, 1885-1888.