The 1st Workshop on Child, Computer and Interaction (WOCCI2008)
Chania, Crete, Greece
The synthesis of child speech presents challenges both in the collection of data and in the building of a synthesiser from that data. Because only limited data can be collected, and the domain of that data is constrained, it is difficult to obtain the type of phonetically-balanced corpus usually used in speech synthesis. As a consequence, building a synthesiser from this data is difficult. Concatenative synthesisers are not robust to corpora with many missing units (as is likely when the corpus content is not carefully designed), so we chose to build a statistical parametric synthesiser using the HMM-based system HTS. This technique has previously been shown to perform well for limited amounts of data, and for data collected under imperfect conditions. We compared 6 different configurations of the synthesiser, using both speaker-dependent and speaker-adaptive modelling techniques, and using varying amounts of data. The output from these systems was evaluated alongside natural and vocoded speech, in a Blizzard-style listening test.
Bibliographic reference. Watts, Oliver / Yamagishi, Junichi / Berkling, Kay / King, Simon (2008): "HMM-based synthesis of child speech", In WOCCI-2008, paper 19.