ISCA Archive ICSLP 1998

An F0 contour control model for totally speaker driven text to speech system

Takehiko Kagoshima, Masahiro Morita, Shigenobu Seto, Masami Akamine

The Totally Speaker Driven Text to Speech System produces high-quality, natural speech that resembles the acoustic and prosodic characteristics of the original speech corpus. In this system's F0 contour control, the F0 contour of a whole sentence is produced by concatenating segmental F0 contours, each generated by modifying a representative vector that captures a typical F0 contour. The representative vectors are selected from an F0 contour codebook, which is designed to minimize the approximation error between the F0 contours generated by the proposed model and real F0 contours extracted from a speech corpus. Experiments with a Japanese speech corpus confirmed that F0 contours can be modeled with small approximation errors using only 48 representative vectors, and that the synthetic speech sounded very natural and resembled the prosodic characteristics of the original speaker.
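
The select, modify, and concatenate flow described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify how representative vectors are modified or how the error is measured, so everything here is an assumption made for illustration: modification is taken to be linear time stretching plus a constant offset, the error is a sum of squared differences, and the names `resample`, `best_entry`, and `sentence_contour` are hypothetical.

```python
from typing import List, Sequence, Tuple


def resample(vector: Sequence[float], length: int) -> List[float]:
    """Stretch or compress a vector to `length` points by linear interpolation."""
    if length <= 0:
        return []
    if length == 1 or len(vector) == 1:
        return [float(vector[0])] * length
    scale = (len(vector) - 1) / (length - 1)
    out = []
    for i in range(length):
        pos = i * scale
        lo = int(pos)
        hi = min(lo + 1, len(vector) - 1)
        frac = pos - lo
        out.append(vector[lo] * (1.0 - frac) + vector[hi] * frac)
    return out


def best_entry(codebook: Sequence[Sequence[float]],
               target: Sequence[float]) -> Tuple[int, float, float]:
    """Return (index, offset, error) of the codebook entry closest to `target`.

    Assumed modification: stretch to the target length, then add the offset
    that aligns the means. Assumed error: sum of squared differences.
    """
    best = (-1, 0.0, float("inf"))
    for index, vector in enumerate(codebook):
        shape = resample(vector, len(target))
        offset = sum(target) / len(target) - sum(shape) / len(shape)
        error = sum((s + offset - t) ** 2 for s, t in zip(shape, target))
        if error < best[2]:
            best = (index, offset, error)
    return best


def sentence_contour(codebook: Sequence[Sequence[float]],
                     segments: Sequence[Tuple[int, int, float]]) -> List[float]:
    """Concatenate modified representative vectors into one sentence contour.

    Each segment is (codebook index, length in frames, offset).
    """
    contour: List[float] = []
    for index, length, offset in segments:
        contour.extend(v + offset for v in resample(codebook[index], length))
    return contour


# Toy two-entry codebook (the paper reports 48 representative vectors).
codebook = [[0.0, 1.0, 0.0], [0.0, 0.5, 1.0]]
contour = sentence_contour(codebook, [(0, 5, 5.0), (1, 3, 4.0)])
index, offset, error = best_entry(codebook, [4.0, 4.5, 5.0])
```

In this toy setup the values stand in for log-F0 per frame, and `best_entry` shows only how a single segment could be matched against a fixed codebook; it does not attempt to reproduce how the paper designs the codebook itself.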


doi: 10.21437/ICSLP.1998-29

Cite as: Kagoshima, T., Morita, M., Seto, S., Akamine, M. (1998) An F0 contour control model for totally speaker driven text to speech system. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0214, doi: 10.21437/ICSLP.1998-29

@inproceedings{kagoshima98_icslp,
  author={Takehiko Kagoshima and Masahiro Morita and Shigenobu Seto and Masami Akamine},
  title={{An F0 contour control model for totally speaker driven text to speech system}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 0214},
  doi={10.21437/ICSLP.1998-29}
}