Speech Prosody 2004

Nara, Japan
March 23-26, 2004

Automatic Determination of Phrase Breaks for Argentine Spanish

Humberto M. Torres, Jorge A. Gurlekian

Laboratorio de Investigaciones Sensoriales, CONICET, Universidad de Buenos Aires, Argentina

This work evaluates how effectively different sets of word classes (parts of speech), together with normalized versus non-normalized counts of syllables and words, predict non-orthographic phrase breaks in an Argentine Spanish database designed for developing the prosody component of a text-to-speech system. Regression trees were trained and tested on a set of 741 sentences, using two different proportions of the data. The resulting error ranges from 8% to 15%; the minimum is obtained with a reduced number of morphological categories and normalized syllable and word counts.

