Speech Prosody 2004
We examined the amount of material required to build prosodic models for duration, fundamental frequency and intensity. Fifty multiple linear regression models were built for two MARSEC speakers on the basis of 70 utterances (7522 and 7643 segments). Models based on 8 and 20 utterances showed closeness of fit comparable to that reported by other researchers for much larger corpora. Little systematic improvement was seen beyond 20 utterances. A predictor ranking procedure advantageously replaced the more commonly used regression trees. Results suggest that a series of well-adapted small-footprint models provides more accurate information about the individual use of prosody in specific speech situations than a single model based on abundant data.
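The footprint comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the corpus is synthetic, the three predictors (`stressed`, `phrase_final`, `syllable_size`) and their effect sizes are hypothetical, and the in-sample Pearson correlation between predicted and observed durations is only an assumed stand-in for "closeness of fit". The sketch fits a multiple linear regression on the first 8, 20 and 70 utterances and reports the fit for each footprint.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, targets):
    """Fit a multiple linear regression with intercept via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, rows):
    """Apply fitted coefficients (intercept first) to predictor rows."""
    return [coefs[0] + sum(c * v for c, v in zip(coefs[1:], r)) for r in rows]


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def make_synthetic_utterances(n_utterances, segments_per_utterance, rng):
    """Synthetic stand-in for a corpus; predictors and effects are invented."""
    utterances = []
    for _ in range(n_utterances):
        segments = []
        for _ in range(segments_per_utterance):
            stressed = float(rng.random() < 0.4)
            phrase_final = float(rng.random() < 0.1)
            syllable_size = float(rng.randint(1, 4))
            duration_ms = (60 + 25 * stressed + 40 * phrase_final
                           + 8 * syllable_size + rng.gauss(0, 20))
            segments.append(((stressed, phrase_final, syllable_size), duration_ms))
        utterances.append(segments)
    return utterances


def fit_by_footprint(utterances, footprints):
    """Map each footprint n to the in-sample r of a model built on the first n utterances."""
    results = {}
    for n in footprints:
        segments = [seg for utt in utterances[:n] for seg in utt]
        rows = [seg[0] for seg in segments]
        targets = [seg[1] for seg in segments]
        coefs = fit_ols(rows, targets)
        results[n] = pearson_r(predict(coefs, rows), targets)
    return results


rng = random.Random(0)  # fixed seed so the sketch is reproducible
corpus = make_synthetic_utterances(70, 100, rng)
fits = fit_by_footprint(corpus, [8, 20, 70])
for n, r in fits.items():
    print(f"{n:3d} utterances: r = {r:.3f}")
```

On real data, the same loop would be run separately per speaker and per prosodic parameter (duration, fundamental frequency, intensity). The numbers printed here reflect only the synthetic data and say nothing about the results reported in the paper.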
Bibliographic reference. Keller, Eric / Zellner-Keller, Brigitte (2004): "Optimal footprint for prosodic modelling", In SP-2004, 463-466.