ISCA Archive SpeechProsody 2010

Long-range prosody prediction and rhythm

Greg Kochanski, Anastassia Loukina, Elinor Keane, Chilin Shih, Burton Rosner

Rhythm is expressed by recurring, hence predictable, beat patterns. Poetry in many languages is composed with attention to poetic meters while prose is not. Therefore, one way to investigate speech rhythm is to evaluate how prose reading differs from poetry reading via a quantitative method that measures predictability.

We use linear regression to predict the acoustic properties of segments from the properties of up to 7 preceding segments. This explains as much as 41% of the variance in our full (prose) corpus and up to 79% in a sub-corpus of poetry. While roughly half of the predictive power comes from the segment immediately preceding the target, the predicted variance increases by 6% (for the full/prose corpus) or by 25% (for the poetry sub-corpus) upon extending the predictor to include the seven preceding segments. Therefore, interactions between segments extend well beyond the immediate vicinity. Potentially, these longer-range regressions capture the rhythms of the poetry. This approach could form a useful method for characterizing the statistical properties of spoken language, especially in reference to prosody and speech rhythm.
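As a rough illustration of the kind of analysis described above, the sketch below fits an ordinary least-squares regression that predicts one value per segment from the values of the preceding segments, and compares the variance explained with 1 versus 7 preceding segments. It is not the authors' implementation: the synthetic signal (a period-4 "beat" plus noise), the single scalar feature per segment, and the helper names `lagged_design` and `r_squared` are all assumptions made for the example.

```python
import numpy as np


def lagged_design(x, n_lags, start):
    """Build a regression problem from a 1-D sequence.

    Targets are x[start:], and column k holds the value k segments earlier.
    `start` must be >= n_lags, so models with different lag counts can be
    compared on the same set of targets.
    """
    x = np.asarray(x, dtype=float)
    if start < n_lags:
        raise ValueError("start must be at least n_lags")
    targets = x[start:]
    cols = [x[start - k: len(x) - k] for k in range(1, n_lags + 1)]
    # Append a constant column so the fit includes an intercept.
    X = np.column_stack(cols + [np.ones(len(targets))])
    return X, targets


def r_squared(x, n_lags, start=7):
    """In-sample fraction of variance explained by the n_lags preceding values."""
    X, y = lagged_design(x, n_lags, start)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1.0 - resid.var() / y.var()


# Hypothetical data: a strictly periodic pattern (period 4) plus Gaussian noise,
# standing in for one acoustic property measured on successive segments.
rng = np.random.default_rng(0)
t = np.arange(2000)
x = np.sin(2 * np.pi * t / 4) + 0.5 * rng.standard_normal(t.size)

r1 = r_squared(x, n_lags=1)  # only the immediately preceding segment
r7 = r_squared(x, n_lags=7)  # up to seven preceding segments
print(f"explained variance, 1 preceding segment:  {r1:.3f}")
print(f"explained variance, 7 preceding segments: {r7:.3f}")
```

In this toy signal the adjacent segment carries almost no information about the target, so nearly all the predictive power comes from the longer-range terms. Real speech data would presumably behave differently, with the immediately preceding segment contributing a large share, as the abstract reports.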

Index Terms: poetry, rhythm, prosody, syllable, prediction.

Cite as: Kochanski, G., Loukina, A., Keane, E., Shih, C., Rosner, B. (2010) Long-range prosody prediction and rhythm. Proc. Speech Prosody 2010, paper 222.

  author={Greg Kochanski and Anastassia Loukina and Elinor Keane and Chilin Shih and Burton Rosner},
  title={{Long-range prosody prediction and rhythm}},
  booktitle={Proc. Speech Prosody 2010},
  pages={paper 222}