We have investigated features reflecting utterance structure and disfluency profile to improve the automated scoring of spontaneous speech responses by non-native speakers of English. Several features were derived from both human-annotated structural events (SEs), e.g., clause structure and disfluencies, and SEs automatically detected on speech transcriptions, and they showed promisingly high correlations with human proficiency scores. However, the usefulness of these SE-derived features on ASR hypotheses was still unknown.
In this paper, we report our studies on detecting SEs from noisy ASR outputs and applying the detected SEs to automated speech scoring. We found that, in the presence of ASR errors, clause boundary (CB) detection was affected much less than the detection of interruption points (IPs) of speech disfluencies. We then evaluated several features derived from the detected SEs by their correlation with human scores and their relative importance in a linear regression model.
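The two evaluation criteria named above (correlation with human scores, and relative importance in a linear regression model) can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names `clauses_per_word` and `ips_per_word`, the toy numbers, and the use of standardized regression coefficients as the measure of "relative importance" are all assumptions made for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def zscore(v):
    """Standardize a list to zero mean and unit (population) variance."""
    n = len(v)
    m = sum(v) / n
    sd = sqrt(sum((a - m) ** 2 for a in v) / n)
    return [(a - m) / sd for a in v]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def standardized_weights(features, scores):
    """Least-squares weights on z-scored features and scores.

    Assumption: the magnitude of each standardized weight is used here
    as a stand-in for the feature's relative importance in the model.
    """
    names = list(features)
    Z = [zscore(features[k]) for k in names]
    y = zscore(scores)
    k = len(names)
    # Normal equations; no intercept is needed after standardization.
    A = [[sum(a * b for a, b in zip(Z[i], Z[j])) for j in range(k)]
         for i in range(k)]
    rhs = [sum(a * c for a, c in zip(Z[i], y)) for i in range(k)]
    return dict(zip(names, solve(A, rhs)))

# Toy data, invented for illustration only (not from the paper).
features = {
    "clauses_per_word": [1.0, 2.0, 3.0, 4.0, 5.0],  # hypothetical CB-based feature
    "ips_per_word": [2.0, 1.0, 4.0, 3.0, 6.0],      # hypothetical IP-based feature
}
human_scores = [2.0, 4.0, 6.0, 8.0, 10.0]

correlations = {k: pearson(v, human_scores) for k, v in features.items()}
weights = standardized_weights(features, human_scores)
```

Standardizing both the features and the scores puts the regression weights on a common scale, which is one simple way to compare features; other importance measures would fit the same description.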
Bibliographic reference. Chen, Lei / Yoon, Su-Youn (2012): "Application of structural events detected on ASR outputs for automated speaking assessment", In INTERSPEECH-2012, 767-770.