Speech Prosody 2004
We analyze the distribution of ToBI labels in a corpus collected from a professional speaker for use in concatenative speech synthesis. One goal is to use such statistics to aid automatic ToBI labeling of such a corpus, analogously to how a language model aids speech recognition. We find that the professional speaker produces a rich variety of prosodic events. ToBI labels occur with skewed frequencies: a trigram model over the occurrences of 34 ToBI labels yields a perplexity of 3.23, indicating that such statistics will likely aid recognition of those prosodic categories. We also relate ToBI label occurrence to sentence type and word frequency, and find patterns which confirm that text information would also be useful to such a recognizer.
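To make the perplexity figure concrete, the following is a minimal sketch of how a trigram model over prosodic labels can be trained and scored. It is not the paper's implementation: the function names, the add-alpha smoothing, the sentence-boundary padding symbols, and the toy label sequences are all assumptions made for illustration, and the toy data is not drawn from the corpus. A model with no information assigns equal probability to every outcome, so its perplexity equals the number of possible outcomes; a perplexity far below the inventory size, such as 3.23 against 34 labels, means the label history narrows the choice considerably.

```python
import math
from collections import Counter

START, END = "<s>", "</s>"  # assumed padding symbols, not from the paper


def pad(seq):
    """Add two start markers and one end marker around a label sequence."""
    return [START, START] + list(seq) + [END]


def train_trigram(sequences, vocab, alpha=1.0):
    """Return P(c | a, b) estimated with add-alpha smoothing (an assumption)."""
    tri, ctx = Counter(), Counter()
    for seq in sequences:
        p = pad(seq)
        for i in range(2, len(p)):
            tri[(p[i - 2], p[i - 1], p[i])] += 1
            ctx[(p[i - 2], p[i - 1])] += 1
    outcomes = len(vocab) + 1  # every label plus the end marker

    def prob(a, b, c):
        return (tri[(a, b, c)] + alpha) / (ctx[(a, b)] + alpha * outcomes)

    return prob


def perplexity(prob, sequences):
    """Compute 2 ** (average negative log2 probability per predicted symbol)."""
    log_sum, n = 0.0, 0
    for seq in sequences:
        p = pad(seq)
        for i in range(2, len(p)):
            log_sum += math.log2(prob(p[i - 2], p[i - 1], p[i]))
            n += 1
    return 2 ** (-log_sum / n)


# Hypothetical toy data: three ToBI-style labels with a skewed distribution.
labels = {"H*", "L+H*", "L-L%"}
toy = [["H*", "L-L%"]] * 5 + [["L+H*", "L-L%"]]

uniform_ppl = perplexity(train_trigram([], labels), toy)
trained_ppl = perplexity(train_trigram(toy, labels), toy)
print(f"untrained: {uniform_ppl:.2f}, trained: {trained_ppl:.2f}")
```

Scoring the model on its own training data, as done here for brevity, understates perplexity; a held-out set would be needed for a realistic estimate.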
Bibliographic reference. Pitrelli, John F. (2004): "ToBI prosodic analysis of a professional speaker of American English", In SP-2004, 557-560.