Fifth ISCA ITRW on Speech Synthesis
June 14-16, 2004
Intonation generation is still one of the weak links in the text-to-speech synthesis chain. Generating expressively neutral pitch contours, with accurate placement of accents and phrase boundaries, is hard enough; generating appropriate intonation for expressive speech is an even greater challenge. This paper is a first attempt at describing and categorizing the variation in pitch contours that occurs in expressive speech, a necessary step in the development of a new intonation model for expressive speech. The analysis is performed in the framework of the Generalized Linear Alignment model. Hierarchical clustering of foot-based pitch contours revealed some interesting phenomena. Apart from the standard declining phrase curve, we observed phrase curves consisting of an incline, an optional plateau, and a decline. These phrase curves are often observed on the last two feet of a minor or major phrase. In addition, the continuation rise that is associated with marking the end of a minor phrase occurred in only about 10% of the cases.
Bibliographic reference. Klabbers, Esther / Santen, Jan P. H. van (2004): "Clustering of foot-based pitch contours in expressive speech", In SSW5-2004, 73-78.