ISCA Archive SpeechProsody 2010

Capturing inter-speaker invariance using statistical measures of rhythm

Tae-Jin Yoon

Statistical rhythm metrics are applied to the Buckeye corpus [1] of spontaneous interview speech to investigate the extent of between-speaker and within-speaker rhythm variability. The Buckeye corpus is unique in that its speakers were all raised in the same region of North America and hence share the same regional dialect. Rhythm metrics are computed for each of 10 speakers. The results show that the rhythm measures that capture the least dialectal variance are the normalized pairwise variability indices calculated over adjacent consonantal durations and over adjacent vocalic durations. This finding implies that these statistical measures of rhythm can be used to capture dialectal similarities.
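The abstract does not spell out how the normalized pairwise variability index is computed. As an illustration only, the sketch below assumes the commonly used definition: the mean, over adjacent interval pairs, of the absolute duration difference divided by the pair's mean duration, scaled by 100. The function name `npvi` and the sample durations are hypothetical and are not taken from the paper or from the Buckeye corpus.

```python
from typing import Sequence


def npvi(durations: Sequence[float]) -> float:
    """Normalized pairwise variability index over successive interval durations.

    Assumes the commonly used definition (not confirmed by the abstract):
    100 * mean(|d_k - d_{k+1}| / ((d_k + d_{k+1}) / 2)).
    Durations are assumed to be strictly positive (e.g. seconds).
    """
    if len(durations) < 2:
        raise ValueError("need at least two intervals to form a pair")
    # One normalized difference per pair of adjacent intervals.
    terms = [
        abs(a - b) / ((a + b) / 2.0)
        for a, b in zip(durations, durations[1:])
    ]
    return 100.0 * sum(terms) / len(terms)


# Hypothetical vocalic interval durations in seconds, for illustration only.
vocalic = [0.08, 0.15, 0.06, 0.12]
vocalic_npvi = npvi(vocalic)
```

Under this definition the same function would be applied separately to a speaker's vocalic intervals and consonantal intervals. Because each difference is divided by the local pair mean, the index is less sensitive to overall speaking rate than a raw duration difference would be.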

Index Terms: speech rhythm, Buckeye corpus, rhythm metrics, between-speaker rhythmic variability, within-speaker rhythmic variability

[1] Pitt, M.A., Dilley, L., Johnson, K., Kiesling, S., Raymond, W., Hume, E. and Fosler-Lussier, E. (2007) Buckeye Corpus of Conversational Speech (2nd release). Columbus, OH: Department of Psychology, Ohio State University (Distributor).

Cite as: Yoon, T.-J. (2010) Capturing inter-speaker invariance using statistical measures of rhythm. Proc. Speech Prosody 2010, paper 201

  author={Tae-Jin Yoon},
  title={{Capturing inter-speaker invariance using statistical measures of rhythm}},
  booktitle={Proc. Speech Prosody 2010},
  pages={paper 201}