Second Workshop on Child, Computer and Interaction (WOCCI 2009)

Cambridge, MA, USA
November 5, 2009

A Review of ASR Technologies for Children’s Speech

Matteo Gerosa (1), Diego Giuliani (1), Shrikanth Narayanan (2), Alexandros Potamianos (3)

(1) FBK Fondazione Bruno Kessler, Povo (TN), Italy
(2) SAIL Lab, Viterbi School of Engineering, University of Southern California, Los Angeles, CA, USA
(3) Dept. of Electronics and Computer Engineering, Tech. Univ. of Crete, Chania, Greece

In this paper, we review: (1) the acoustic and linguistic properties of children’s speech for both read and spontaneous speech, and (2) the developments in automatic speech recognition for children with application to spoken dialogue and multimodal dialogue system design. First, the effect of developmental changes on the absolute values and variability of acoustic correlates is presented for read speech for children ages 6 and up. Then, verbal child-machine spontaneous interaction is reviewed and results from recent studies are presented. Age trends of acoustic, linguistic and interaction parameters are discussed, such as sentence duration, filled pauses, politeness and frustration markers, and modality usage. Some differences between child-machine and human-human interaction are pointed out. The implications for acoustic modeling, linguistic modeling and spoken dialogue system design for children are presented. We conclude with a review of relevant applications of spoken dialogue technologies for children.

Bibliographic reference. Gerosa, Matteo / Giuliani, Diego / Narayanan, Shrikanth / Potamianos, Alexandros (2009): "A review of ASR technologies for children’s speech", In WOCCI-2009, 89-96.