ISCA Archive ISCSLP 2008

Heteronym Verification for Mandarin Speech Synthesis

Heng Lu, Zhen-Hua Ling, Si Wei, Yu Hu, Li-Rong Dai, Ren-Hua Wang

Accurate phonetic transcription of the speech corpus is critical to high-quality speech synthesis. In a Mandarin text-to-speech (MTTS) system, one major problem in automatically labeling the database is heteronym annotation, because some single-character and multi-character words in Mandarin have more than one pronunciation. In this paper, a heteronym annotation verification method for MTTS database labeling is proposed. By training context-dependent HMMs and calculating the log likelihood ratio, each heteronym in the database is assigned a confidence score, and those scoring below a threshold are selected for manual inspection. We divide Mandarin heteronyms into two categories and use different features for each category. Experiments on an artificial test set show equal error rates (EERs) of 7.9% and 11.9% for the two categories. A further test on an actual database containing a total of 36098 heteronyms shows that the proposed method can find 89 of the 123 annotation errors by inspecting only 639 polyphones.

Index Terms: heteronym annotation, log likelihood ratio, MTTS, automatic labeling
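The verification idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the per-pronunciation log likelihoods have already been computed elsewhere (for example by aligning the audio against HMMs for each candidate pronunciation), and that the ratio compares the annotated pronunciation with the best competing one; the abstract does not state the exact formulation. All function names and the pinyin labels are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def score_heteronym(labeled: str, logliks: Dict[str, float]) -> float:
    """Confidence score for one heteronym token.

    Assumed form: log likelihood of the annotated pronunciation minus the
    log likelihood of the best competing pronunciation.
    """
    alternatives = [v for k, v in logliks.items() if k != labeled]
    return logliks[labeled] - max(alternatives)


def flag_for_inspection(scores: Sequence[float], threshold: float) -> List[int]:
    """Indices of tokens whose confidence falls below the threshold."""
    return [i for i, s in enumerate(scores) if s < threshold]


def equal_error_rate(
    correct_scores: Sequence[float], error_scores: Sequence[float]
) -> Tuple[float, float]:
    """Sweep candidate thresholds over the observed scores.

    Returns (EER estimate, threshold) at the point where the false-reject
    rate (correct annotations flagged) and the false-accept rate (wrong
    annotations not flagged) are closest.
    """
    best = None
    for t in sorted(set(correct_scores) | set(error_scores)):
        false_reject = sum(s < t for s in correct_scores) / len(correct_scores)
        false_accept = sum(s >= t for s in error_scores) / len(error_scores)
        gap = abs(false_reject - false_accept)
        if best is None or gap < best[0]:
            best = (gap, (false_reject + false_accept) / 2, t)
    return best[1], best[2]


# Illustrative values only; these are not data from the paper.
confidence = score_heteronym("zhong4", {"zhong4": -120.0, "chong2": -135.0})
flagged = flag_for_inspection([15.0, -3.0, 0.5], threshold=1.0)
eer, eer_threshold = equal_error_rate([5, 4, 3, 1], [-2, -1, 2, 0])
```

For scale, the figures reported in the abstract mean that inspecting 639 of 36098 heteronyms (about 1.8%) recovers 89 of 123 annotation errors (about 72%).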

Cite as: Lu, H., Ling, Z.-H., Wei, S., Hu, Y., Dai, L.-R., Wang, R.-H. (2008) Heteronym Verification for Mandarin Speech Synthesis. Proc. International Symposium on Chinese Spoken Language Processing, 137-140

  author={Heng Lu and Zhen-Hua Ling and Si Wei and Yu Hu and Li-Rong Dai and Ren-Hua Wang},
  title={{Heteronym Verification for Mandarin Speech Synthesis}},
  booktitle={Proc. International Symposium on Chinese Spoken Language Processing},