12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Where Should Pitch Accents and Phrase Breaks Go? A Syntax Tree Transducer Solution

Joseph Tepperman, Emily Nava

Rosetta Stone Labs, USA

Motivated by a desire to assess the prosody of foreign language learners, this study demonstrates the benefit of high-level syntactic information in automatically deciding where phrase breaks and pitch accents should go in text. The connection between syntax and prosody is well established and lends itself naturally to tree-based probabilistic models. With automatically derived parse trees paired with tree transducer models, we found that categorical prosody tags for unseen text can be determined with significantly higher accuracy than with a baseline method that uses n-gram models of part-of-speech tags. On the Boston University Radio News Corpus, the tree transducer outperformed the baseline by 14% overall for accents and by 3% overall for breaks. These automatic results fell within the corpus's range of inter-speaker agreement in assigning accents and breaks to text.


Bibliographic reference. Tepperman, Joseph / Nava, Emily (2011): "Where should pitch accents and phrase breaks go? A syntax tree transducer solution", in INTERSPEECH-2011, 1353-1356.