ISCA Archive ISCSLP 2008

Predicting Spectral and Prosodic Parameters for Unit Selection in Speech Synthesis

Ming-Hui Dong, Hai-Zhou Li

We usually build a prosody model to predict prosodic parameters, which are then used as part of the criteria for unit selection. Spectral appropriateness of units is usually ensured only through the identities of context units, which are linguistic symbols. Without looking into the spectral properties of the actual signal, spectral mismatches are often perceived in the synthetic speech. In this paper, we propose to use MFCCs as spectral parameters in addition to the prosodic parameters. By introducing the spectral parameters into the criteria for unit selection, the appropriateness of units can be determined by statistical models, which reduces the possibility of abnormal spectral mismatches between concatenated units. Experiments show that the approach helps to improve the quality of synthetic speech.

Index Terms: speech synthesis, unit selection, spectral and prosodic parameters, parameter prediction
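
As a rough illustration of the idea in the abstract, the sketch below scores candidate units against predicted prosodic and spectral (MFCC) targets and picks the closest one. It is not the authors' implementation: the `Unit` structure, the Euclidean distance, the weights `w_prosody` and `w_spectral`, and the use of a single MFCC vector per unit are all assumptions made for the example, and concatenation costs are left out.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Unit:
    """A candidate unit from the inventory (hypothetical structure)."""
    name: str
    prosody: Sequence[float]  # assumed: normalized prosodic features, e.g. pitch and duration
    mfcc: Sequence[float]     # assumed: one summary MFCC vector per unit


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def target_cost(pred_prosody: Sequence[float], pred_mfcc: Sequence[float],
                unit: Unit, w_prosody: float = 1.0, w_spectral: float = 1.0) -> float:
    """Weighted sum of prosodic and spectral distances to the predicted targets."""
    return (w_prosody * euclidean(pred_prosody, unit.prosody)
            + w_spectral * euclidean(pred_mfcc, unit.mfcc))


def select_unit(pred_prosody: Sequence[float], pred_mfcc: Sequence[float],
                candidates: List[Unit], **weights: float) -> Unit:
    """Return the candidate with the lowest target cost (first one wins ties)."""
    return min(candidates,
               key=lambda u: target_cost(pred_prosody, pred_mfcc, u, **weights))


# Two candidates with identical prosody but different spectra.
candidates = [
    Unit("a", prosody=[1.0, 0.5], mfcc=[0.0, 0.0]),
    Unit("b", prosody=[1.0, 0.5], mfcc=[3.0, 4.0]),
]
best = select_unit([1.0, 0.5], [3.0, 4.0], candidates)
```

In this toy setup the two candidates cannot be told apart by prosody alone; only the spectral term separates them, which is the kind of case the abstract argues linguistic context symbols may miss.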


Cite as: Dong, M.-H., Li, H.-Z. (2008) Predicting Spectral and Prosodic Parameters for Unit Selection in Speech Synthesis. Proc. International Symposium on Chinese Spoken Language Processing, 133-136

@inproceedings{dong08_iscslp,
  author={Ming-Hui Dong and Hai-Zhou Li},
  title={{Predicting Spectral and Prosodic Parameters for Unit Selection in Speech Synthesis}},
  year=2008,
  booktitle={Proc. International Symposium on Chinese Spoken Language Processing},
  pages={133--136}
}