ISCA Archive Interspeech 2009

Asynchronous F0 and spectrum modeling for HMM-based speech synthesis

Cheng-Cheng Wang, Zhen-Hua Ling, Li-Rong Dai

This paper proposes an asynchronous model structure for fundamental frequency (F0) and spectrum modeling in HMM-based parametric speech synthesis to improve the performance of F0 prediction. In the conventional system, F0 and spectrum features are assumed to be synchronous. Because these two features are produced by the movement of different speech organs, an explicitly asynchronous model structure is introduced. At the training stage, F0 models are trained asynchronously with the spectrum models. At the synthesis stage, the two features are generated separately. Objective and subjective evaluation results show that the proposed method can effectively improve the accuracy of F0 prediction.

doi: 10.21437/Interspeech.2009-136

Cite as: Wang, C.-C., Ling, Z.-H., Dai, L.-R. (2009) Asynchronous F0 and spectrum modeling for HMM-based speech synthesis. Proc. Interspeech 2009, 404-407, doi: 10.21437/Interspeech.2009-136

  author={Cheng-Cheng Wang and Zhen-Hua Ling and Li-Rong Dai},
  title={{Asynchronous F0 and spectrum modeling for HMM-based speech synthesis}},
  booktitle={Proc. Interspeech 2009},