INTERSPEECH 2012
13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Phrase Boundary Assignment from Text in Multiple Domains

Andrew Rosenberg (1), Raul Fernandez (2), Bhuvana Ramabhadran (2)

(1) Computer Science Department, Queens College (CUNY), New York, NY, USA
(2) IBM TJ Watson Research Lab, Yorktown Heights, NY, USA

Detecting and modeling proper phrasing from an input text string is important for producing synthetic speech that sounds intelligible and natural. Knowledge of proper phrase structure influences, e.g., the placement and length of pauses and the realization of phrase-final boundary contours, both of which can affect a listener's percepts, ranging from naturalness to semantic interpretation. In this work, we model the occurrence and types of phrase breaks from purely textual features, paying close attention to how system performance generalizes in- and out-of-domain across corpora of various types (such as broadcast news, spontaneous speech, and synthesis databases), and how it varies with the subsets of syntactic and lexical features investigated.

Index Terms: Prosody Modeling, Prosodic Assignment, Speech Synthesis

Bibliographic reference. Rosenberg, Andrew / Fernandez, Raul / Ramabhadran, Bhuvana (2012): "Phrase boundary assignment from text in multiple domains", in INTERSPEECH-2012, 2558-2561.