Sixth International Conference on Spoken Language Processing
October 16-20, 2000
Comparing Static and Dynamic Features for Segmental Cost Function Calculation in Concatenative Speech Synthesis
Laboratory for Acoustics and Speech Communication,
Dresden University of Technology, Germany
The evaluation of spectral fit at the concatenation point is
essential for the generation of smooth and natural-sounding
speech in concatenative speech synthesis. Spectral features of a
speech unit are mainly influenced by its segmental structure
and its segmental context. Duration plays a key role as well. In
order to concatenate units without audible signal discontinuities, a
distance feature that adequately models these influences is needed
to evaluate concatenation quality.
This paper presents a new distance feature for use in the unit
selection of a concatenative synthesiser: the articulation function.
This feature is able to model coarticulation processes and their
acoustical results adequately. The modelling power of this feature
will be compared with that of the symbol-based segmental features
conventionally used in cost function evaluation.
Hirschfeld, Diane (2000):
"Comparing static and dynamic features for segmental cost function calculation in concatenative speech synthesis",
In ICSLP-2000, vol.2, 435-438.