As large spoken language corpora become available, we can revisit previous analyses based on smaller datasets and verify whether their conclusions generalise to the new data. In this paper, we present an analysis of speaking style variation in French, based on a large-scale corpus (450 hours, 2500 speakers), and compare it with previous analyses that were based on smaller corpora. The corpus was segmented at the phonetic, syllabic and word levels; parts of speech and syntactic dependencies were annotated automatically, enhancing existing annotations; and a multitude of acoustic and prosodic features were automatically extracted. Statistical analysis (clustering, PCA) is used to explore the characteristics of speaking styles, individual variation, and the discriminatory power of different sets of prosodic and linguistic features. We present a framework for modelling the relationship between prosodic units and syntactic units at various levels of granularity; this framework is based on Universal Dependencies for syntax and on the acoustic correlates of prosodic boundaries, and can thus be generalised to multiple languages. Finally, we explore congruences and mismatches between prosodic and syntactic boundaries across speaking styles.
Cite as: Christodoulides, G. (2020) Speaking Style Prosodic Variation and the Prosody-Syntax Interface: A Large-Scale Corpus Study. Proc. Speech Prosody 2020, 705-709, doi: 10.21437/SpeechProsody.2020-144
@inproceedings{christodoulides20_speechprosody,
  author={George Christodoulides},
  title={{Speaking Style Prosodic Variation and the Prosody-Syntax Interface: A Large-Scale Corpus Study}},
  year=2020,
  booktitle={Proc. Speech Prosody 2020},
  pages={705--709},
  doi={10.21437/SpeechProsody.2020-144}
}