A Mandarin Prosodic Boundary Prediction Model Based on Multi-Task Learning

Huashan Pan, Xiulin Li, Zhiqiang Huang


In this paper, we propose a Mandarin prosodic boundary prediction model based on a Multi-Task Learning (MTL) architecture. Mandarin prosody has a three-level hierarchical structure built from three basic units: Prosodic Word (PW), Prosodic Phrase (PPH) and Intonational Phrase (IPH) [1]. Previous studies usually decompose Mandarin prosodic boundary prediction into three independent tasks, one for each type of unit boundary [1–4]. In recent years, with the development of deep learning, MTL has achieved state-of-the-art performance on many tasks in the Natural Language Processing (NLP) field [5–7]. Inspired by this, we implement an MTL framework that uses a Bidirectional Long Short-Term Memory network with a Conditional Random Field (BLSTM-CRF) as the basic model and treats the three independent boundary prediction tasks as sub-modules, one each for PW, PPH and IPH. Under the MTL architecture, the three tasks are unified and optimized jointly. The experimental results show that our model is effective for Mandarin prosodic boundary prediction: the overall prediction performance is improved by 0.8%, and the model size is reduced by about 55%.
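
The three-task decomposition described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the label scheme or the training objective, so the integer boundary levels, the B/N tags, the helper names `split_tasks` and `joint_loss`, and the weighted-sum loss are all assumptions made for illustration. The sketch also assumes that a higher-level boundary implies every lower-level one, which is one common reading of a hierarchical prosody structure.

```python
# Hypothetical encoding, not taken from the paper: the boundary level after
# each character is 0 (no boundary), 1 (PW), 2 (PPH) or 3 (IPH).
LEVELS = {"PW": 1, "PPH": 2, "IPH": 3}


def split_tasks(boundary_levels):
    """Derive one B/N label sequence per prosodic level.

    Assumes a boundary at a higher level also counts as a boundary
    at every lower level.
    """
    return {
        name: ["B" if level >= threshold else "N" for level in boundary_levels]
        for name, threshold in LEVELS.items()
    }


def joint_loss(task_losses, weights=None):
    """Combine per-task losses into one objective.

    A weighted sum with equal default weights is an assumed, common MTL
    choice; the abstract only says the tasks are optimized together.
    """
    if weights is None:
        weights = {name: 1.0 for name in task_losses}
    return sum(weights[name] * loss for name, loss in task_losses.items())


# Eight characters: PW boundaries after characters 2 and 6, a PPH boundary
# after character 4, and an IPH boundary at the end of the sentence.
example = [0, 1, 0, 2, 0, 1, 0, 3]
labels = split_tasks(example)
total = joint_loss({"PW": 1.0, "PPH": 2.0, "IPH": 3.0})
```

In a shared-encoder setup, each of the three label sequences would supervise its own output layer while the combined loss updates the shared parameters, which is one plausible way a multi-task model could end up smaller than three separate ones.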


DOI: 10.21437/Interspeech.2019-1400

Cite as: Pan, H., Li, X., Huang, Z. (2019) A Mandarin Prosodic Boundary Prediction Model Based on Multi-Task Learning. Proc. Interspeech 2019, 4485-4488, DOI: 10.21437/Interspeech.2019-1400.


@inproceedings{Pan2019,
  author={Huashan Pan and Xiulin Li and Zhiqiang Huang},
  title={{A Mandarin Prosodic Boundary Prediction Model Based on Multi-Task Learning}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={4485--4488},
  doi={10.21437/Interspeech.2019-1400},
  url={http://dx.doi.org/10.21437/Interspeech.2019-1400}
}