This paper presents a method for modeling the dependency between F0 and spectral features in HMM-based parametric speech synthesis. Conventional systems model these two features as independent streams, which is inconsistent with the fact that the F0 and spectral parameters extracted for model training always interact. A piecewise linear transform is introduced to explicitly model the dependency of the spectrum on F0. Experimental results show that the proposed method improves the accuracy of spectral parameter prediction, provided that the F0 features are predicted based on a reliable voicing decision.
Index Terms: speech synthesis, hidden Markov model, STRAIGHT, cross-stream dependency, linear transform
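To make the idea of a piecewise linear dependency of the spectrum on F0 concrete, the following is a minimal sketch and not the formulation used in the paper. It assumes that the F0 axis is split into regions by fixed boundaries and that each region has its own per-dimension slope and offset, which shift a state's spectral mean vector. The function name `predict_spectral_mean` and all parameter names and values are hypothetical.

```python
from bisect import bisect_right


def predict_spectral_mean(base_mean, f0, boundaries, slopes, offsets):
    """Shift a spectral mean vector by a piecewise linear function of F0.

    Assumed (hypothetical) parameterization, not taken from the paper:
      boundaries: sorted F0 values that split the F0 axis into regions
      slopes[k], offsets[k]: per-dimension linear coefficients for region k
    """
    # Pick the region whose F0 interval contains the given value.
    k = bisect_right(boundaries, f0)
    # Apply that region's linear transform to every spectral dimension.
    return [m + a * f0 + b for m, a, b in zip(base_mean, slopes[k], offsets[k])]


# Illustrative values only: two regions split at log-F0 = 5.0.
boundaries = [5.0]
slopes = [[0.1, 0.0], [0.2, 0.0]]
offsets = [[0.0, 0.0], [-0.5, 0.0]]

low = predict_spectral_mean([1.0, 2.0], 4.0, boundaries, slopes, offsets)
high = predict_spectral_mean([1.0, 2.0], 6.0, boundaries, slopes, offsets)
```

In this sketch the offsets are chosen so that the two linear pieces meet at the boundary; whether the paper's transform enforces such continuity, or how it ties transforms to HMM states, is not stated in the abstract.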
Cite as: Ling, Z.-H., Zhang, W., Wang, R.-H. (2008) Cross-Stream Dependency Modeling for HMM-based Speech Synthesis. Proc. International Symposium on Chinese Spoken Language Processing, 5-8
@inproceedings{ling08_iscslp, author={Zhen-Hua Ling and Wei Zhang and Ren-Hua Wang}, title={{Cross-Stream Dependency Modeling for HMM-based Speech Synthesis}}, year=2008, booktitle={Proc. International Symposium on Chinese Spoken Language Processing}, pages={5--8} }