Segmental duration and segmental power control factors were statistically analyzed for Japanese speech synthesis using a large sentence speech database. Through these analyses, the prosodic characteristics of segmental duration control and segmental power control were compared. Large differences were found in factors such as the neighboring phoneme, the intensity of fundamental frequency and the range of utterance group final positions. The analyses also confirmed that the linear model used in our statistical analysis accurately predicts both segmental duration and segmental power.

Keywords: Speech synthesis, Segmental duration, Power, Synthesis by rule, Prosody control
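As a rough illustration of what a linear model over categorical control factors can look like, the sketch below fits an additive model (a global mean plus one offset per factor level) by simple backfitting. It is not the authors' implementation: the factor names `next_phoneme` and `position`, the function names, and the four duration values are all hypothetical, invented purely for this example.

```python
def fit_additive(rows, targets, n_iter=50):
    """Fit targets ~ mean + sum of per-factor level offsets by backfitting."""
    mean = sum(targets) / len(targets)
    factors = list(rows[0].keys())
    # One offset per (factor, level), all starting at zero.
    offsets = {f: {r[f]: 0.0 for r in rows} for f in factors}
    for _ in range(n_iter):
        for f in factors:
            sums, counts = {}, {}
            for r, t in zip(rows, targets):
                # Residual left after removing the mean and the other factors.
                other = sum(offsets[g][r[g]] for g in factors if g != f)
                sums[r[f]] = sums.get(r[f], 0.0) + (t - mean - other)
                counts[r[f]] = counts.get(r[f], 0) + 1
            # New offset for each level is the average residual at that level.
            offsets[f] = {lvl: sums[lvl] / counts[lvl] for lvl in sums}
    return mean, offsets


def predict(mean, offsets, row):
    """Predicted value for one segment described by its factor levels."""
    return mean + sum(offsets[f][row[f]] for f in offsets)


# Toy data (hypothetical values in ms, NOT taken from the paper).
rows = [
    {"next_phoneme": "vowel", "position": "medial"},
    {"next_phoneme": "vowel", "position": "final"},
    {"next_phoneme": "consonant", "position": "medial"},
    {"next_phoneme": "consonant", "position": "final"},
]
durations = [75.0, 95.0, 65.0, 85.0]

mean, offsets = fit_additive(rows, durations)
estimate = predict(mean, offsets, {"next_phoneme": "vowel", "position": "final"})
print(mean, offsets, estimate)
```

The same routine could be applied to segmental power by swapping the target values; comparing the fitted offsets of the two models is one way to see which factors matter more for duration than for power.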
Cite as: Kaiki, N., Mimura, K., Sagisaka, Y. (1991) Statistical modeling of segmental duration and power control for Japanese. Proc. 2nd European Conference on Speech Communication and Technology (Eurospeech 1991), 625-628, doi: 10.21437/Eurospeech.1991-154
@inproceedings{kaiki91_eurospeech,
  author={Nobuyoshi Kaiki and Katsuhiko Mimura and Yoshinori Sagisaka},
  title={{Statistical modeling of segmental duration and power control for Japanese}},
  year=1991,
  booktitle={Proc. 2nd European Conference on Speech Communication and Technology (Eurospeech 1991)},
  pages={625--628},
  doi={10.21437/Eurospeech.1991-154}
}