Sixth European Conference on Speech Communication and Technology
This paper addresses the question of how children between the ages of nine and eleven perceive and respond to prosodic variation in speech synthesis. Prosodic features were varied in samples of both concatenative and formant synthesis. Children and an adult control group were asked to compare these samples and to judge which were the most fun and which were the most natural. Results indicate that children perceive prosodic differences in the synthesis examples and prefer large manipulations in F0 and duration when a fun voice is intended. Even for naturalness, the children often prefer larger manipulations in F0 than are present in the default versions of the synthesis. These results may have implications for implementing synthesis in a commercial computer program for children and, more widely, for child-directed speech synthesis.
Bibliographic reference. House, David / Bell, Linda / Gustafson, Kjell / Johansson, Linn (1999): "Child-directed speech synthesis: evaluation of prosodic variation for an educational computer program", in EUROSPEECH'99, 1843-1846.