Paralinguistic cues in children's speech convey the child's affective state and can serve as important markers for the early detection of autism spectrum disorder (ASD). In this paper, we detect paralinguistic events, such as laughter and fussing/crying, along with toddlers' speech from the Multi-modal Dyadic Behavior Dataset (MMDB). We use both spectral and prosodic acoustic features, selected using a combination of filter-based and wrapper-based methods. Using a support vector machine with a linear kernel, the classification accuracy was 77.87% for detecting laughter in children's speech and 79.37% for detecting fussing/crying. A three-class scheme for detecting laughter, fussing/crying, and speech yielded an accuracy of 69.73%. To test whether the approach generalizes for detecting fussing/crying, we used recordings from the Strange Situation protocol, which is used to observe attachment behavior between an infant and a parent. On this cross-corpus test set, we obtained a fussing/crying detection accuracy of 71.6%. These results indicate that the selected acoustic features are capable of discriminating children's laughter, fussing/crying, and speech. They also indicate that the algorithms generalize well to a dataset of paralinguistic cues from a different age group, infants (12 to 18 months of age), gathered in a different context.
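The two-stage feature selection mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which filter or wrapper methods were used, so everything below is an assumption for illustration: the filter is a Fisher-score ranking, the wrapper is greedy forward selection, a nearest-centroid classifier stands in for the linear-kernel support vector machine so that the example needs only the Python standard library, and the data are synthetic rather than drawn from MMDB.

```python
import random
from statistics import mean, pvariance

def fisher_score(values, labels):
    """Filter criterion: squared gap between class means over summed class variances."""
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    return (mean(a) - mean(b)) ** 2 / (pvariance(a) + pvariance(b) + 1e-12)

def filter_step(X, y, keep):
    """Rank feature columns by Fisher score and return the top `keep` indices."""
    n_features = len(X[0])
    scores = [fisher_score([row[j] for row in X], y) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:keep]

def centroid_accuracy(X_tr, y_tr, X_va, y_va, feats):
    """Validation accuracy of a nearest-centroid classifier restricted to `feats`.
    (Stand-in for the linear SVM named in the abstract.)"""
    centroids = {}
    for c in (0, 1):
        rows = [r for r, y in zip(X_tr, y_tr) if y == c]
        centroids[c] = [mean(r[j] for r in rows) for j in feats]
    correct = 0
    for r, y in zip(X_va, y_va):
        dist = {c: sum((r[j] - m) ** 2 for j, m in zip(feats, centroids[c]))
                for c in (0, 1)}
        correct += min(dist, key=dist.get) == y
    return correct / len(y_va)

def wrapper_step(X_tr, y_tr, X_va, y_va, candidates):
    """Greedy forward selection: keep adding the candidate feature that most
    improves validation accuracy, and stop when no candidate improves it."""
    selected, best = [], 0.0
    improved = True
    while improved:
        improved = False
        for j in candidates:
            if j in selected:
                continue
            acc = centroid_accuracy(X_tr, y_tr, X_va, y_va, selected + [j])
            if acc > best:
                best, choice, improved = acc, j, True
        if improved:
            selected.append(choice)
    return selected, best

# Synthetic two-class data (hypothetical): features 0 and 1 carry the class
# signal, features 2 to 5 are pure noise.
random.seed(0)
y = [i % 2 for i in range(200)]
X = [[random.gauss(4 * c, 1), random.gauss(-4 * c, 1)]
     + [random.gauss(0, 1) for _ in range(4)] for c in y]
X_tr, y_tr, X_va, y_va = X[:140], y[:140], X[140:], y[140:]

candidates = filter_step(X_tr, y_tr, keep=3)                       # cheap ranking first
selected, best = wrapper_step(X_tr, y_tr, X_va, y_va, candidates)  # classifier-driven search
print("filter kept:", candidates)
print("wrapper selected:", selected, "validation accuracy:", round(best, 3))
```

The filter stage is cheap because it scores each feature independently of any classifier, while the wrapper stage is costlier because it retrains the classifier for each candidate subset. Running the filter first keeps the wrapper's search space small.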
Bibliographic reference. Rao, Hrishikesh / Kim, Jonathan C. / Clements, Mark A. / Rozga, Agata / Messinger, Daniel S. (2014): "Detection of children's paralinguistic events in interaction with caregivers", In INTERSPEECH-2014, 1229-1233.