Interspeech'2005 - Eurospeech

Lisbon, Portugal
September 4-8, 2005

Comparing Spectral Distance Measures for Join Cost Optimization in Concatenative Speech Synthesis

Ingmund Bjørkan, Torbjørn Svendsen, Snorre Farner

Norwegian University of Science & Technology, Norway

In concatenative synthesis, the join cost function can be related to the probability of a perceived discontinuity at the join. It is therefore important that the distance measures in the cost function correlate highly with human-perceived discontinuities. This paper presents the results of a listening test on joins in two Norwegian long vowels, /A:/ and /e:/. Five spectral distance measures and the F0 difference are compared as predictors of the human-perceived discontinuities using Receiver Operating Characteristic (ROC) curves. In addition, a linear join cost function is optimized by means of stepwise linear regression.
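As a rough illustration of the ROC comparison described in the abstract, the sketch below scores one distance measure against binary listener judgments. It is not the authors' implementation: the function names `roc_points` and `auc`, the sample distances, and the labels are all hypothetical, and the abstract does not say how the paper's thresholds or features were computed. The only assumption carried over is that a larger distance should predict a perceived discontinuity.

```python
def roc_points(distances, perceived):
    """Return ROC points as (false positive rate, true positive rate) pairs.

    distances: one distance value per join (hypothetical values here).
    perceived: 1 if listeners heard a discontinuity at that join, else 0.
    """
    positives = sum(perceived)
    negatives = len(perceived) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one join of each class")

    points = [(0.0, 0.0)]
    # Sweep the decision threshold from the largest distance downwards;
    # a join is flagged as discontinuous when its distance >= threshold.
    for threshold in sorted(set(distances), reverse=True):
        tp = sum(1 for d, y in zip(distances, perceived) if d >= threshold and y == 1)
        fp = sum(1 for d, y in zip(distances, perceived) if d >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Made-up example: four joins, two of them judged discontinuous.
example_distances = [0.1, 0.4, 0.35, 0.8]
example_perceived = [0, 0, 1, 1]
example_auc = auc(roc_points(example_distances, example_perceived))  # 0.75
```

Under this setup, a measure whose curve lies closer to the top-left corner (area nearer 1.0) separates perceived from unperceived discontinuities better, which is one way several candidate measures could be ranked against each other.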

Bibliographic reference. Bjørkan, Ingmund / Svendsen, Torbjørn / Farner, Snorre (2005): "Comparing spectral distance measures for join cost optimization in concatenative speech synthesis", in INTERSPEECH-2005, 2577-2580.