Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Estimation of Duration Models for Phonemes in Mexican Speech Synthesis

Horacio Meza Escalona, Ingrid Kirschning, Ofelia Cervantes Villagómez

Tlatoa Group, ICT, CENTIA, Universidad de las Américas, Puebla, Mexico

Voice is the most widely used medium of communication. For this reason, it is the natural medium for human-computer interaction, and its use is growing. A person needs no specialized training to operate a computer by voice. The study of speech is therefore of general interest, as is the construction of machines or systems that produce speech automatically.

This work presents the results of measurements conducted to obtain duration models that can be integrated into a Spanish synthesis system such as Festival. The characteristics of the corpus used in the experiments are also described. Results are reported for isolated phonemes, as well as for phonemes with left context, right context, and both. Finally, the results are discussed to illustrate the improvements obtained in the synthesis system with the new duration models, and directions for future work are outlined.

Bibliographic reference. Escalona, Horacio Meza / Kirschning, Ingrid / Villagómez, Ofelia Cervantes (2000): "Estimation of duration models for phonemes in Mexican speech synthesis", In ICSLP-2000, vol.1, 685-688.