Fifth ISCA ITRW on Speech Synthesis

June 14-16, 2004
Pittsburgh, PA, USA

Clustering of Foot-Based Pitch Contours in Expressive Speech

Esther Klabbers, Jan P. H. van Santen

Center for Spoken Language Understanding, OGI School of Science & Engineering, Oregon Health & Science University, Beaverton, OR, USA

Intonation generation remains one of the weak links in the text-to-speech synthesis chain. Generating expressively neutral pitch contours with accurately placed accents and phrase boundaries is hard enough; generating appropriate intonation for expressive speech is a greater challenge still. This paper is a first attempt at describing and categorizing the variation in pitch contours that occurs in expressive speech, a necessary step in developing a new intonation model for expressive speech. The analysis is performed in the framework of the Generalized Linear Alignment model [10]. Hierarchical clustering of foot-based pitch contours revealed some interesting phenomena. Apart from the standard declining phrase curve, we observed phrase curves consisting of an incline, an optional plateau, and a decline. These phrase curves are often observed on the last two feet of a minor or major phrase. In addition, the continuation rise associated with marking the end of a minor phrase occurred in only about 10% of the cases.
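As a rough illustration of what hierarchical clustering of foot-sized pitch contours can look like, the sketch below time-normalizes each contour to a fixed number of points and merges the closest clusters bottom-up. The abstract does not state the distance measure, linkage rule, or contour representation used in the paper, so Euclidean distance, average linkage, linear resampling, the function names, and the toy F0 values are all assumptions made for this example.

```python
from itertools import combinations
from math import sqrt


def resample(contour, n_points=5):
    """Linearly interpolate a list of F0 values (Hz) to n_points samples (n_points >= 2)."""
    if len(contour) == 1:
        return [float(contour[0])] * n_points
    out = []
    for i in range(n_points):
        pos = i * (len(contour) - 1) / (n_points - 1)
        lo = int(pos)
        hi = min(lo + 1, len(contour) - 1)
        frac = pos - lo
        out.append(contour[lo] * (1 - frac) + contour[hi] * frac)
    return out


def distance(a, b):
    """Euclidean distance between two equal-length contours (an assumed metric)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_contours(contours, n_clusters, n_points=5):
    """Average-linkage agglomerative clustering; returns sorted lists of contour indices."""
    vectors = [resample(c, n_points) for c in contours]
    clusters = [[i] for i in range(len(vectors))]

    def linkage(pair):
        # Mean pairwise distance between the members of two clusters.
        a, b = pair
        total = sum(distance(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        a, b = min(combinations(clusters, 2), key=linkage)
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return sorted(sorted(c) for c in clusters)


# Toy, made-up contours: two declining shapes and two incline-plateau-decline shapes.
feet = [
    [120, 115, 110, 105],
    [125, 118, 112, 104, 100],
    [100, 115, 120, 120, 105],
    [98, 112, 121, 119, 108, 102],
]
groups = cluster_contours(feet, n_clusters=2)
print(groups)
```

On this toy input the two declining contours end up in one cluster and the two incline-plateau-decline contours in the other. Real data would also call for pitch normalization across speakers and a principled way to pick the number of clusters, neither of which is attempted here.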


Bibliographic reference. Klabbers, Esther / Santen, Jan P. H. van (2004): "Clustering of foot-based pitch contours in expressive speech", in SSW5-2004, 73-78.