ISCA Archive ECST 1987

From segmental synthesis to acoustic rules using time dependent modeling techniques

Gerard Chollet, Gunnar Ahlbom, Frederic Bimbot, Alvaro De Lima-Veiga

Intelligible text-to-speech can be achieved by concatenating spectrally encoded segments. However, the resulting speech lacks naturalness, which may be attributed to the difficulty of controlling speech parameters. Acoustic rules are better suited to this control. The aim of this work is to provide a methodology for moving from a segmental to a rule-based approach. A number of interactive tools are proposed; they use signal and data analysis techniques to model spectral evolution, infer spectral targets, and generate appropriate transitions between these targets. The choice of spectral parameters is essential. A set of French speech segments ("polysons") from a single speaker has been encoded using these tools. Spectral targets were constrained to belong to a finite set of vectors (allophonic targets). Coarticulation effects (vowel reduction, nasalisation, ...) can be accounted for by controlling the duration of the temporal evolution functions. Segment concatenation problems are eliminated. Current work addresses automatic procedures for selecting allophonic targets for new speakers and for grouping temporal patterns into rules.
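
The idea of targets blended by temporal evolution functions can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each spectral frame is a normalized weighted sum of target vectors, and the raised-cosine shape, the names `evolution` and `synthesize`, and the toy two-dimensional targets are all hypothetical choices made for illustration.

```python
import math


def evolution(t, center, duration):
    """Hypothetical temporal evolution function: a raised-cosine bump.

    Equals 1 at `center` and falls to 0 at `center +/- duration / 2`.
    """
    x = (t - center) / duration
    if abs(x) >= 0.5:
        return 0.0
    return 0.5 * (1.0 + math.cos(2.0 * math.pi * x))


def synthesize(targets, centers, durations, n_frames):
    """Blend target vectors into a frame-by-frame spectral trajectory.

    Each frame is the weighted average of the targets, with weights given
    by the evolution functions (an assumed model, not taken from the paper).
    """
    dim = len(targets[0])
    frames = []
    for t in range(n_frames):
        weights = [evolution(t, c, d) for c, d in zip(centers, durations)]
        total = sum(weights)
        if total == 0.0:
            raise ValueError(f"no evolution function covers frame {t}")
        frame = [
            sum(w * target[i] for w, target in zip(weights, targets)) / total
            for i in range(dim)
        ]
        frames.append(frame)
    return frames


# Toy allophonic targets: consonant-like, vowel-like, consonant-like.
targets = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
centers = [0, 10, 20]

# Equal durations: the vowel target is fully reached at its center.
full = synthesize(targets, centers, [20, 20, 20], 21)

# Longer neighbouring functions overlap the vowel center, so the vowel
# target is undershot (a crude stand-in for vowel reduction).
reduced = synthesize(targets, centers, [30, 20, 30], 21)

print(full[10], reduced[10])
```

In this sketch, changing only the durations moves the middle frame from the vowel target itself to a blend of roughly two thirds vowel and one third neighbouring targets, which is the kind of duration-driven coarticulation control the abstract describes.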


Cite as: Chollet, G., Ahlbom, G., Bimbot, F., Lima-Veiga, A.D. (1987) From segmental synthesis to acoustic rules using time dependent modeling techniques. Proc. European Conference on Speech Technology, 2389-2392

@inproceedings{chollet87_ecst,
  author={Gerard Chollet and Gunnar Ahlbom and Frederic Bimbot and Alvaro De Lima-Veiga},
  title={{From segmental synthesis to acoustic rules using time dependent modeling techniques}},
  year=1987,
  booktitle={Proc. European Conference on Speech Technology},
  pages={2389--2392}
}