Speech Prosody 2004

Nara, Japan
March 23-26, 2004

ToBI Prosodic Analysis of a Professional Speaker of American English

John F. Pitrelli

IBM T.J. Watson Research Center, Yorktown Heights, NY, USA

We analyze the distribution of ToBI labels in a corpus collected from a professional speaker for use in concatenative speech synthesis. Our goals include using such statistics to aid automatic ToBI labeling of such a corpus, analogously to how a language model aids speech recognition. We find that the professional speaker produces a rich variety of prosodic events. ToBI labels occur with skewed frequencies: a trigram model over occurrences of 34 ToBI labels yields a perplexity of 3.23, indicating that such statistics will likely aid recognition of those prosodic categories. We relate ToBI label occurrence to sentence type and word frequency, finding patterns which confirm that text information would also be useful to such a recognizer.
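To make the perplexity figure concrete, the following is a minimal sketch of how a trigram model over label sequences can be scored. It is not the procedure used in this paper: the add-k smoothing, the padding symbols, the function names, and the toy label sequences are all assumptions introduced for illustration. A perplexity far below the label inventory size (34 here) means the model finds the next label much more predictable than a uniform guess would.

```python
import math
from collections import Counter

def train_trigram(sequences, k=1.0):
    """Count trigrams and their two-label contexts (hypothetical helper).

    Each sequence is padded with two start symbols and one end symbol.
    k is an add-k smoothing constant, an assumption of this sketch.
    """
    tri, ctx = Counter(), Counter()
    vocab = {"</s>"}  # the end symbol is a predictable event too
    for seq in sequences:
        padded = ["<s>", "<s>"] + list(seq) + ["</s>"]
        vocab.update(seq)
        for i in range(2, len(padded)):
            tri[tuple(padded[i - 2:i + 1])] += 1
            ctx[tuple(padded[i - 2:i])] += 1
    return tri, ctx, vocab, k

def perplexity(model, sequences):
    """Return 2 ** (average negative log2 probability per predicted label)."""
    tri, ctx, vocab, k = model
    log_sum, n = 0.0, 0
    for seq in sequences:
        padded = ["<s>", "<s>"] + list(seq) + ["</s>"]
        for i in range(2, len(padded)):
            history = tuple(padded[i - 2:i])
            count = tri[history + (padded[i],)]
            # Add-k smoothed conditional probability of the label.
            p = (count + k) / (ctx[history] + k * len(vocab))
            log_sum += math.log2(p)
            n += 1
    return 2 ** (-log_sum / n)

# Toy data, invented for illustration: one pitch accent, one boundary label.
toy = [["H*", "L-L%"]] * 50
model = train_trigram(toy)
pp = perplexity(model, toy)
```

On this fully predictable toy corpus the perplexity is close to 1; a real corpus with skewed but varied label usage falls somewhere between 1 and the inventory size.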

