ISCA Archive Interspeech 2005

Perceptually-based data-driven join costs: comparing join types

Ann K. Syrdal, Alistair D. Conkie

Unit selection synthesis has improved the quality of synthetic speech by making it possible to concatenate speech from a large database to produce intelligible synthesis while preserving much of the naturalness of the original signal. Such synthesis is by no means perfect, however, and this paper describes work to achieve better joins between concatenated units. Results from a psychoacoustic experiment, acoustic parameters, and phonetic factors are analyzed and used in the statistical training of join costs so that audible discontinuities at concatenation boundaries can be minimized.

doi: 10.21437/Interspeech.2005-620

Cite as: Syrdal, A.K., Conkie, A.D. (2005) Perceptually-based data-driven join costs: comparing join types. Proc. Interspeech 2005, 2813-2816, doi: 10.21437/Interspeech.2005-620

  author={Ann K. Syrdal and Alistair D. Conkie},
  title={{Perceptually-based data-driven join costs: comparing join types}},
  booktitle={Proc. Interspeech 2005},