ISCA Archive ICSLP 2000

Comparing static and dynamic features for segmental cost function calculation in concatenative speech synthesis

Diane Hirschfeld

The evaluation of spectral fit at the concatenation point is essential for the generation of smooth and natural-sounding speech in concatenative speech synthesis. Spectral features of a speech unit are mainly influenced by its segmental structure and its segmental context. Duration plays a key role as well. To concatenate units without audible signal discontinuities, the evaluation of concatenation quality requires a distance feature that adequately models these influences.

This paper presents a new distance feature for use in the unit selection of a concatenative synthesiser: the articulation function. This feature is able to model coarticulation processes and their acoustic results adequately. The modelling power of this feature is compared to that of the symbol-based segmental features conventionally used in cost function evaluation.


Cite as: Hirschfeld, D. (2000) Comparing static and dynamic features for segmental cost function calculation in concatenative speech synthesis. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 2, 435-438

@inproceedings{hirschfeld00b_icslp,
  author={Diane Hirschfeld},
  title={{Comparing static and dynamic features for segmental cost function calculation in concatenative speech synthesis}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  pages={vol. 2, 435-438}
}