ISCA Archive Interspeech 2007

Corpus-based generation of prosodic features from text based on generation process model

Keikichi Hirose, Keiko Ochi, Nobuaki Minematsu

A complete scheme for generating prosodic features from text input was constructed. The method consists of corpus-based prediction of pauses, phone durations, and fundamental frequencies (F0's), in this order, with the information predicted in each step used in the steps that follow. Because F0 prediction operates on the command values of the F0 contour generation process model rather than on F0 values directly, stable and flexible control of F0 contours is possible. Adding constraints on the accent command timings as a post-processing step improved the quality of speech synthesized with the prosodic features generated by the method. The validity of the method was confirmed through a listening test of the synthetic speech.
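
As an illustration of what "command values" means here, the following is a minimal sketch, assuming that the generation process model refers to the Fujisaki command-response model, in which log F0 is a baseline plus the responses to phrase commands (impulses) and accent commands (steps). The function names, the example command values, and the constants (alpha = 3.0, beta = 20.0, gamma = 0.9, commonly quoted typical values) are assumptions for illustration and are not taken from the paper.

```python
import math

# Assumed typical constants; the paper's actual settings are not stated in the abstract.
ALPHA = 3.0   # natural angular frequency of the phrase control mechanism (1/s)
BETA = 20.0   # natural angular frequency of the accent control mechanism (1/s)
GAMMA = 0.9   # ceiling of the accent component


def phrase_response(t, alpha=ALPHA):
    """Impulse response of the phrase control mechanism (zero before onset)."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def accent_response(t, beta=BETA, gamma=GAMMA):
    """Step response of the accent control mechanism, clipped at gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def f0_contour(times, fb, phrase_cmds, accent_cmds):
    """Compute F0 (Hz) at each time from command values.

    fb          -- baseline frequency in Hz
    phrase_cmds -- list of (onset_time, magnitude)
    accent_cmds -- list of (onset_time, offset_time, amplitude)
    """
    contour = []
    for t in times:
        log_f0 = math.log(fb)
        for t0, ap in phrase_cmds:
            log_f0 += ap * phrase_response(t - t0)
        for t1, t2, aa in accent_cmds:
            log_f0 += aa * (accent_response(t - t1) - accent_response(t - t2))
        contour.append(math.exp(log_f0))
    return contour


# Hypothetical command values: one phrase command and one accent command.
times = [i * 0.01 for i in range(100)]
contour = f0_contour(times, 120.0, [(0.0, 0.5)], [(0.2, 0.5, 0.4)])
```

Under this formulation, a predictor only has to output a handful of command timings and magnitudes per phrase, and the resulting contour is smooth by construction, which is consistent with the stability and flexibility claimed above.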


doi: 10.21437/Interspeech.2007-228

Cite as: Hirose, K., Ochi, K., Minematsu, N. (2007) Corpus-based generation of prosodic features from text based on generation process model. Proc. Interspeech 2007, 1274-1277, doi: 10.21437/Interspeech.2007-228

@inproceedings{hirose07_interspeech,
  author={Keikichi Hirose and Keiko Ochi and Nobuaki Minematsu},
  title={{Corpus-based generation of prosodic features from text based on generation process model}},
  year=2007,
  booktitle={Proc. Interspeech 2007},
  pages={1274--1277},
  doi={10.21437/Interspeech.2007-228}
}