Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Comparing Static and Dynamic Features for Segmental Cost Function Calculation in Concatenative Speech Synthesis

Diane Hirschfeld

Laboratory for Acoustics and Speech Communication, Dresden University of Technology, Germany

The evaluation of spectral fit at the concatenation point is essential for generating smooth and natural-sounding speech in concatenative speech synthesis. The spectral features of a speech unit are mainly influenced by its segmental structure and its segmental context. Duration plays a key role as well. To concatenate units without audible signal discontinuities, the evaluation of concatenation quality requires a distance feature that adequately models these influences.

This paper presents a new distance feature for use in the unit selection of a concatenative synthesiser: the articulation function. This feature is able to model coarticulation processes and their acoustic results adequately. The modelling power of this feature is compared to that of the symbol-based segmental features conventionally used in cost function evaluation.



Bibliographic reference. Hirschfeld, Diane (2000): "Comparing static and dynamic features for segmental cost function calculation in concatenative speech synthesis", in ICSLP-2000, vol. 2, 435-438.