In this paper we describe the implementation of the duration rules for a Spanish text-to-speech synthesizer called AMIGO. Phoneme durations are generated with a simple multiplicative model, in which a base duration value specific to each phoneme is modified by a series of multiplicative coefficients that depend on the context. In some cases, a minimum duration is applied instead of the value predicted by this model. To construct the rules, the factors that are relevant to duration patterns (stress, left and right context, position within the phrase, etc.) and their weights were first determined through the study of an acoustic database containing more than 10,000 labelled spoken phonemes.
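The multiplicative model described above can be illustrated with a minimal Python sketch. The abstract gives no numeric values, so every base duration, minimum duration, coefficient, and factor name below is hypothetical and chosen only for illustration; the function name `phoneme_duration` is likewise an assumption. The sketch also assumes that the minimum duration acts as a floor on the predicted value, which is one plausible reading of "in some cases, a minimum duration is applied" and not necessarily the rule used in AMIGO.

```python
from math import prod

# Hypothetical base durations in milliseconds (illustrative values only,
# not taken from the paper).
BASE_DURATION_MS = {"a": 80.0, "s": 100.0, "t": 70.0}

# Hypothetical minimum durations in milliseconds (illustrative values only).
MIN_DURATION_MS = {"a": 40.0, "s": 50.0, "t": 45.0}

# Hypothetical context coefficients: values above 1 lengthen the phoneme,
# values below 1 shorten it. Factor names are assumptions.
COEFFICIENTS = {
    "stressed": 1.25,
    "phrase_final": 1.5,
    "in_cluster": 0.5,
}


def phoneme_duration(phoneme, active_factors):
    """Return a duration in ms: base value times the product of the
    coefficients of all active context factors, floored at the minimum
    duration for that phoneme (the floor is an assumption)."""
    scale = prod(COEFFICIENTS[factor] for factor in active_factors)
    predicted = BASE_DURATION_MS[phoneme] * scale
    return max(predicted, MIN_DURATION_MS[phoneme])


if __name__ == "__main__":
    print(phoneme_duration("a", []))                            # 80.0
    print(phoneme_duration("a", ["stressed", "phrase_final"]))  # 150.0
    print(phoneme_duration("t", ["in_cluster"]))                # 45.0
```

In the last call the model predicts 35 ms, which falls below the hypothetical 45 ms minimum, so the minimum is returned instead. In a real system the coefficients would be estimated from a labelled database such as the one described in the abstract.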
Cite as: Macarron, A., Escalada, G., Rodriguez, M.A. (1991) Generation of duration rules for a Spanish text-to-speech synthesizer. Proc. 2nd European Conference on Speech Communication and Technology (Eurospeech 1991), 617-620, doi: 10.21437/Eurospeech.1991-152
@inproceedings{macarron91_eurospeech,
  author={Alejandro Macarron and Gregorio Escalada and Miguel Angel Rodriguez},
  title={{Generation of duration rules for a Spanish text-to-speech synthesizer}},
  year=1991,
  booktitle={Proc. 2nd European Conference on Speech Communication and Technology (Eurospeech 1991)},
  pages={617--620},
  doi={10.21437/Eurospeech.1991-152}
}