ISCA Archive Interspeech 2021

Polyphone Disambiguation in Mandarin Chinese with Semi-Supervised Learning

Yi Shi, Congyi Wang, Yu Chen, Bin Wang

The majority of Chinese characters are monophonic, while a special group of characters, called polyphonic characters, have multiple pronunciations. As a prerequisite for speech-related generative tasks, the correct pronunciation must be identified among several candidates. This process is called polyphone disambiguation. Although the problem has been well explored with both knowledge-based and learning-based approaches, it remains challenging due to the lack of publicly available labeled datasets and the irregular nature of polyphones in Mandarin Chinese. In this paper, we propose a novel semi-supervised learning (SSL) framework for Mandarin Chinese polyphone disambiguation that can potentially leverage unlimited unlabeled text data. We explore the effect of various proxy labeling strategies, including entropy-thresholding and lexicon-based labeling. Qualitative and quantitative experiments demonstrate that our method achieves state-of-the-art performance. In addition, we publish a novel dataset specifically for the polyphone disambiguation task to promote further research.
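As a rough illustration of the entropy-thresholding idea named in the abstract, the sketch below accepts a model's prediction for an unlabeled polyphonic character as a proxy label only when the entropy of the predicted distribution is low. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the threshold value of 0.3 nats, and the pinyin candidate strings are all hypothetical choices made for this example.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def proxy_label(probs, candidates, max_entropy=0.3):
    """Return the most probable pronunciation if the prediction is confident.

    probs       -- predicted probabilities, one per candidate pronunciation
    candidates  -- candidate pronunciations for one polyphonic character
    max_entropy -- hypothetical threshold; the paper's value is not given here
    Returns None when the entropy exceeds the threshold, meaning the
    unlabeled example would be left out of the proxy-labeled set.
    """
    if len(probs) != len(candidates):
        raise ValueError("probs and candidates must have the same length")
    if entropy(probs) > max_entropy:
        return None
    best = max(range(len(probs)), key=probs.__getitem__)
    return candidates[best]


# Illustrative candidates for one polyphonic character (assumed values).
candidates = ["xing2", "hang2"]
confident = proxy_label([0.97, 0.03], candidates)   # low entropy: label kept
uncertain = proxy_label([0.55, 0.45], candidates)   # high entropy: rejected
```

In this sketch the confident prediction is kept as a proxy label, while the near-uniform one is discarded, so that only low-entropy predictions would be added to the training data.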

doi: 10.21437/Interspeech.2021-502

Cite as: Shi, Y., Wang, C., Chen, Y., Wang, B. (2021) Polyphone Disambiguation in Mandarin Chinese with Semi-Supervised Learning. Proc. Interspeech 2021, 4109-4113, doi: 10.21437/Interspeech.2021-502

  author={Yi Shi and Congyi Wang and Yu Chen and Bin Wang},
  title={{Polyphone Disambiguation in Mandarin Chinese with Semi-Supervised Learning}},
  booktitle={Proc. Interspeech 2021},