This paper reports an investigation of features relevant for classifying two speaking styles: conversational speech and clear (e.g., hyper-articulated) speech. Spectral and prosodic features were automatically extracted from speech and classified using decision tree classifiers and multilayer perceptrons, achieving accuracies of about 71% and 77%, respectively. More interestingly, we found that of the 56 features, only about 9 are needed to capture most of the predictive power. While perceptual studies have shown that spectral cues are more useful than prosodic features for intelligibility, here we find that prosodic features are more important for classification.
Bibliographic reference. Amano-Kusumoto, Akiko / Hosom, John-Paul / Shafran, Izhak (2009): "Classifying clear and conversational speech based on acoustic features", In INTERSPEECH-2009, 1735-1738.